Just when you started coming to terms with ChatGPT's eerie capabilities, OpenAI dropped a new version of its AI language model.
OpenAI says GPT-4 is much more advanced than GPT-3.5, which powers ChatGPT. And to prove it, the company made GPT-4 sit down for a bunch of exams. OpenAI tested GPT-4 on a variety of standardized tests, from the high school to the graduate and professional levels, spanning mathematics, science, coding, history, literature, and even the one you take to become a sommelier. The exams comprised multiple-choice and free-response questions, and GPT-4 was scored using the standard methodology for each exam.
Put your pencil down, GPT-4, it's time to check your scores.
GPT-4 didn't just get into law school, it passed the bar. The AI language model scored in the 88th percentile on the LSAT (Law School Admission Test) and did even better on the bar exam (Uniform Bar Exam), scoring in the 90th percentile. By comparison, GPT-3.5 landed in the bottom 40 percent on the LSAT and the bottom 10 percent on the bar exam.
GPT-4 took both the math and reading/writing sections of the SAT and all three sections of the GRE, which is broken down into quantitative, verbal, and writing skills. It scored in the 80th percentile or higher on every section except the writing section of the GRE... where it kind of bombed, landing in the 54th percentile.
The quintessential overachiever, GPT-4 also took all the AP (Advanced Placement) high school exams. It aced most of them, scoring between the 84th and 100th percentiles, with a few outliers.
GPT-4 scored in the 44th percentile in AP English Language and a measly 22nd in AP English Literature. So all you wordsmiths out there might have some more time before GPT-4 replaces you. GPT-4 didn't do so hot on AP Calculus BC either, scoring between the 43rd and 59th percentiles, proving that even for a supercomputer, calculus is not easy. But that still earns GPT-4 a four, so it might still place out of college calculus.
GPT-4 still has some work to do with its coding skills, which is curious since one of its marketed uses is helping developers. Its Codeforces rating is 392. Codeforces hosts competitive programming events, and that score puts GPT-4 way down in the Newbie category, reserved for ratings of 1199 and below.
It did pretty well on LeetCode's easy problems (31 out of 41 solved) but struggled at the medium and hard difficulty levels (21/80 and 3/45, respectively). As we saw in the developer demo livestream, GPT-4 is fully capable of writing Python, but it required some manual tweaking to set the right parameters, which might explain some of these test scores. Or maybe it didn't eat breakfast that morning.
GPT-4 passed the sommelier exams with flying colors. It placed lowest (77th percentile) in the most advanced sommelier exam. But for a non-human entity that's never tasted wine, we'll let that one slide.
OpenAI has released a full breakdown of how GPT-4 performed. GPT-4 might not write the next great American novel... yet, but GPT-4's future as a mathematically brilliant lawyer and wine connoisseur looks pretty bright.