Not every AI prompt deserves multiple seconds of thinking: how Meta teaches models to prioritize
Technology


Vantage Feed · Published February 5, 2025 · Last updated February 5, 2025, 6:38 pm



Reasoning models such as OpenAI o1 and DeepSeek-R1 have a problem: ask them a simple question like "What is 1+1?" and they will think for several seconds before answering.

Ideally, like humans, AI models should be able to tell when to give a direct answer and when to spend extra time and resources reasoning before they respond. A new technique presented by researchers at Meta AI and the University of Illinois Chicago trains models to allocate an inference budget based on the difficulty of the query. The result is faster responses, lower costs, and better allocation of compute resources.

DeepSeek-R1 reasoning through the "1+1" prompt (screenshot)

Expensive inference

Large language models (LLMs) can improve their performance on reasoning problems when they produce longer reasoning chains, often referred to as "chain of thought" (CoT). The success of CoT has given rise to a whole range of inference-time scaling techniques that prompt the model to "think" longer about the problem, produce multiple answers, and select the best one.
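For illustration, the difference between a direct prompt and a chain-of-thought prompt is just a change in wording; the phrasing below is a common pattern, not taken from the paper.

```python
# Direct prompt: the model answers immediately, spending few tokens.
direct_prompt = "What is 17 * 24? Answer with just the number."

# Chain-of-thought variant: the model is asked to reason step by step
# before answering, trading extra tokens for higher accuracy on hard
# problems. The exact wording is illustrative, not the paper's.
cot_prompt = "What is 17 * 24? Think step by step, then give the answer."
```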

One of the main techniques used in reasoning models is to generate multiple answers and choose the one that recurs most often, also known as "majority voting" (MV). The problem with this approach is that the model adopts a uniform behavior, treating every prompt as a hard reasoning problem and spending unnecessary resources on generating multiple answers.
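As a rough sketch of how MV works, the snippet below always samples a fixed number of answers and returns the most frequent one. Here `generate_answer` is a hypothetical stand-in for a sampled call to the reasoning model, not an API from the paper.

```python
from collections import Counter

def majority_vote(generate_answer, prompt, n_samples=8):
    """Plain majority voting (MV): always sample n_samples answers,
    then return the one that occurs most often. Every prompt, easy
    or hard, pays the full cost of n_samples generations."""
    answers = [generate_answer(prompt) for _ in range(n_samples)]
    return Counter(answers).most_common(1)[0][0]
```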

Smart inference

In the new paper, the researchers propose a series of training techniques that make reasoning models more efficient at responding. The first is "sequential voting" (SV), where the model aborts the reasoning process as soon as an answer appears a certain number of times. For example, the model is prompted to generate a maximum of eight answers and choose the answer that comes up at least three times. If the model is given the simple query mentioned above, the first three answers will probably be similar, triggering an early stop.
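A minimal sketch of sequential voting under the same assumptions (the `generate_answer` placeholder again stands in for the model call), using the 8-sample, 3-vote setting from the example above:

```python
from collections import Counter

def sequential_vote(generate_answer, prompt, max_samples=8, threshold=3):
    """Sequential voting (SV): sample answers one at a time and stop
    as soon as any answer has appeared `threshold` times. An easy
    prompt such as "What is 1+1?" typically terminates after the
    first three identical answers, saving five generations."""
    counts = Counter()
    for _ in range(max_samples):
        answer = generate_answer(prompt)  # placeholder model call
        counts[answer] += 1
        if counts[answer] >= threshold:
            return answer  # early exit: consensus reached
    return counts.most_common(1)[0][0]  # fall back to plain majority
```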

Their experiments show that SV outperforms classic MV on math competition problems when it generates the same number of answers. However, SV requires extra instructions and token generation, which puts it roughly on par with MV in terms of the token-to-accuracy ratio.

SV outperforms MV in the number of responses but matches it in the number of tokens (source: arXiv)

The second technique, "adaptive sequential voting" (ASV), improves on SV by prompting the model to examine the problem and generate multiple answers only when it is difficult. For simple problems (such as the 1+1 prompt), the model simply generates a single answer without running the voting process. This makes the model more efficient at handling both simple and complex problems.
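Schematically, ASV adds a difficulty gate in front of the voting loop. The sketch below reuses `sequential_vote` from above, and `classify_difficulty` is a hypothetical helper; in the paper this behavior is prompted or trained into the model itself rather than implemented as an external classifier.

```python
def adaptive_sequential_vote(classify_difficulty, generate_answer,
                             prompt, max_samples=8, threshold=3):
    """Adaptive sequential voting (ASV), schematically: first judge
    whether the prompt needs deliberation at all, and only run the
    sequential-voting loop for hard problems."""
    if classify_difficulty(prompt) == "easy":
        return generate_answer(prompt)  # single answer, no voting
    return sequential_vote(generate_answer, prompt, max_samples, threshold)
```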

Reinforcement learning

While SV and ASV improve the model's efficiency, they require a lot of hand-labeled data. To alleviate this problem, the researchers propose IBPO, a reinforcement learning algorithm that teaches the model to adjust the length of its reasoning traces based on the difficulty of the query.

IBPO is designed to optimize responses while keeping LLMs within the constraints of an inference budget. The RL algorithm enables the model to surpass the gains obtained through training on manually labeled data, by constantly generating ASV traces, evaluating the responses, and choosing outcomes that provide the correct answer within the optimal inference budget.
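In schematic form (our notation, not the paper's exact formulation), this can be read as a budget-constrained policy optimization problem: maximize expected response quality subject to a cap on expected inference cost.

```latex
% Schematic budget-constrained RL objective. Assumed notation:
% \pi_\theta is the model's policy, R(x, y) the reward for response y
% to prompt x, c(y) the inference cost (e.g. tokens generated), and
% B the inference budget.
\max_{\theta} \;
  \mathbb{E}_{x \sim \mathcal{D},\, y \sim \pi_\theta(\cdot \mid x)}
  \big[ R(x, y) \big]
\quad \text{subject to} \quad
\mathbb{E}_{x,\, y}\big[ c(y) \big] \le B
```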

Their experiments show that IBPO improves the Pareto front, meaning that for a fixed inference budget, a model trained with IBPO outperforms other baselines.

IBPO (green circles) outperforms other baselines on the Pareto front (source: arXiv)

The findings come against the backdrop of researchers warning that current AI models are hitting a wall. Companies are struggling to find high-quality training data and are exploring alternative methods for improving their models.

One promising solution is reinforcement learning, in which the model is given an objective and allowed to find its own solutions, as opposed to supervised fine-tuning (SFT), where the model is trained on manually labeled examples.

Surprisingly, the model often finds solutions that humans haven't thought of. This is a formula that seems to have worked well for DeepSeek-R1, which has challenged the dominance of U.S.-based AI labs.

"Both prompting-based and SFT-based methods struggle with both absolute improvement and efficiency, supporting the conjecture that SFT alone does not enable the self-correction capability," the researchers write. "This observation is also partially supported by concurrent work, which suggests that such self-correction behavior emerges automatically in RL, rather than through prompting or SFT."
