    Why GPT Outperformed Gemini and Ollama in Calculation

    Introduction

    While building a budgeting app, I ran into a significant problem: large language models (LLMs) differ widely in how reliably they handle mathematical tasks. That realization prompted me to rethink which model should manage calculations in my application.

    To test the three LLMs (Gemini, Ollama, and GPT), I gave each of them the same dataset. I created a function to handle the calculation, updated the instructions so that each model was told to use it, and then ran the evaluation. The results revealed a clear hierarchy in performance.

    Gemini did not follow the instructions: instead of calling the function, it performed the calculations itself, and the final total was wrong. Ollama was equally disappointing: it failed to return the relevant transactions, and its response included letters even though the schema explicitly required numerical values.
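
    A failure like Ollama's (letters where the schema requires numbers) can be caught before it reaches the app by validating each response. The following is a minimal sketch, not the code used in this test. The response shape, a `transactions` list with numeric `amount` fields plus a numeric `total`, and the name `validate_response` are assumptions made for illustration.

```python
from numbers import Real


def validate_response(payload):
    """Return a list of schema violations; an empty list means the payload is valid.

    The expected shape (hypothetical, for illustration only):
    {"transactions": [{"description": str, "amount": number}, ...], "total": number}
    """
    errors = []

    def is_number(value):
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, Real) and not isinstance(value, bool)

    transactions = payload.get("transactions")
    if not isinstance(transactions, list) or not transactions:
        errors.append("transactions: expected a non-empty list")
        transactions = []

    for i, tx in enumerate(transactions):
        if not isinstance(tx, dict) or not is_number(tx.get("amount")):
            errors.append(f"transactions[{i}].amount: expected a number")

    if not is_number(payload.get("total")):
        errors.append("total: expected a number")

    return errors
```

    A response with no transactions and a total such as "forty-two" would produce two errors, so the app can reject or retry it instead of displaying a bad figure.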

    GPT, by contrast, was the clear winner in this test. It called the function I provided and returned the correct answer in the form the instructions specified. The experience raises an important question: given these discrepancies in performance, is it worth investing further in the other models?
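
    The pattern GPT followed, delegating the arithmetic to a provided function instead of computing it in the model, can be sketched as below. This is an illustration under stated assumptions, not the function from this test: `sum_transactions`, `TOOLS`, and `handle_tool_call` are hypothetical names, and the tool-call shape (a function name plus a JSON string of arguments) is assumed rather than taken from any specific provider's API.

```python
import json
from decimal import Decimal


def sum_transactions(amounts):
    """Add amounts exactly; Decimal(str(x)) avoids binary floating-point drift."""
    return sum((Decimal(str(a)) for a in amounts), Decimal("0"))


# Registry of functions the model is allowed to call (hypothetical).
TOOLS = {"sum_transactions": sum_transactions}


def handle_tool_call(call):
    """Run the function named in a model's tool call and return a JSON result."""
    func = TOOLS[call["name"]]
    arguments = json.loads(call["arguments"])  # arguments arrive as a JSON string
    total = func(**arguments)
    # Serialize the Decimal as a string so no precision is lost in transit.
    return json.dumps({"total": str(total)})


# Simulated tool call, standing in for what a model might emit.
example_call = {
    "name": "sum_transactions",
    "arguments": '{"amounts": [19.99, 5.01, 0.1]}',
}
print(handle_tool_call(example_call))
```

    Because the total comes from ordinary code, the model only has to pick the right transactions and call the function. Summing with `Decimal` is a design choice for money values: it keeps cents exact where repeated float addition can drift.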

    Keywords

    • GPT
    • Gemini
    • Ollama
    • Calculation
    • Budgeting app
    • Performance evaluation
    • Numerical values
    • Function call

    FAQ

    Q: What was the purpose of testing the three LLMs?
    A: The purpose was to evaluate their capabilities for handling calculations in a budgeting app.

    Q: What were the outcomes of the tests?
    A: GPT successfully called the provided function and returned the correct answer, while Gemini and Ollama failed to meet the expected requirements.

    Q: Why is Gemini considered less effective in this context?
    A: Gemini opted to calculate totals itself and produced incorrect results, despite being instructed to use a predefined function.

    Q: What issues did Ollama encounter?
    A: Ollama failed to return the relevant transactions, and its response contained letters where the schema required numerical values.

    Q: What conclusion can be drawn from this evaluation?
    A: GPT proved to be the most reliable model for calculations in this instance, raising questions about the effectiveness and worth of the other models when it comes to mathematical accuracy.
