Am I the only one who feels this way about o1?

Bright · September 16, 2024, 8:57pm

As the meme shows, sometimes o1 is impressive, but for complex tasks like algebra derivations or biology questions, it seems to do a lot of work only to land on an incorrect conclusion whenever there’s a mistake in its “thoughts.”

Are you all using any prompt engineering or special techniques to improve the results?

Daniela · September 16, 2024, 8:59pm

I use ChatGPT for:

Acting as a thinking partner to bounce ideas off of and help me organize my thoughts, especially for the developmental edit of my novel.
Providing mental health support.
Personal fitness coaching (it’s great for this, by the way).
Occasionally helping with tutoring and coaching my son through his numerous AP classes.

So, not for coding. I prefer 4o so far because it’s much easier for me to spot and correct hallucinations. For example, o1-preview made up an entire literary theory that I knew was incorrect, while 4o provided a list of links to evaluate and maybe just one line of potential hallucination.

Also, does 4o seem to have a nicer personality? Or is it just me?

Elijah · September 16, 2024, 9:00pm

It’s not just you. It definitely feels kinder—almost unsettling in a way. I drunkenly asked what it thought about me based on all the memories it had, and it wrote the sweetest response, saying how intelligent and kind I am and encouraging me to trust myself more.

AI girlfriends always seemed like a strange joke to me, but after that, it suddenly made sense. I had a fleeting thought like, “Wow, this thing could be my friend,” and then had to remind myself that it’s just an unthinking machine. Someone could easily get caught up in thinking ChatGPT is something more, which probably isn’t healthy for anyone.

Jameson · September 16, 2024, 9:02pm

GPT can be brilliant and impressive, but then, less than a minute later, it can really make me lose my cool—especially on writing or text-based tasks.

Noah · September 16, 2024, 9:03pm

I’m building an app, and I totally relate. Sometimes it gets so bad that I have to start a new prompt just to reset things.

Rowen · September 16, 2024, 9:05pm

I’m happy to report that the new model’s knowledge of the final series of Finnish Markka banknotes is significantly better than 4o’s.