• "Towards AGI"
  • Posts
  • Do you want to succeed as Chief AI Officer: Master KYI (Know Your Inference)

Do you want to succeed as Chief AI Officer? Master KYI (Know Your Inference)

Welcome to Towards AGI Newsletter

In our quest to explore the dynamic and rapidly evolving field of Artificial Intelligence, this newsletter is your go-to source for the latest developments, breakthroughs, and discussions on Generative AI. Each edition brings you the most compelling news and insights from the forefront of Generative AI (GenAI), featuring cutting-edge research, transformative technologies, and the pioneering work of industry leaders.

Highlights from GenAI, OpenAI and ClosedAI: Dive into the latest projects and innovations from the leading organisations behind some of the most advanced open-source AI models, along with an inside look at the initiatives and strategies of closed-source AI players.

Stay Informed and Engaged: Whether you're a researcher, developer, entrepreneur, or enthusiast, "Towards AGI" aims to keep you informed and inspired. From technical deep-dives to ethical debates, our newsletter addresses the multifaceted aspects of AI development and its implications for society and industry.

Join us on this exciting journey as we navigate the complex landscape of artificial intelligence, moving steadily towards the realisation of AGI. Stay tuned for exclusive interviews, expert opinions, and much more!

Do you want to succeed as Chief AI Officer? Master KYI (Know Your Inference)

What is inference anyway?

As organisations rapidly build GenAI proofs of concept (PoCs), inference is becoming a key topic for deploying GenAI in production and a central discussion point for Chief AI Officers.

For any GenAI initiative, understanding the significance of AI inference and selecting the right inference strategy for each specific use case is essential. This has to be driven by the organisation's needs and led by the Chief AI Officer in consultation with the Data and AI teams.

What is inference anyway, and why is it important?

AI inference is the process by which a trained AI model applies its learned knowledge to new, unseen data to make predictions or decisions. It is crucial because this is where the practical application of AI comes into play, affecting everything from user experience to the operational efficiency and cost of running the model.

(Image credit: NVIDIA)
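To make the training/inference distinction concrete, here is a minimal, hypothetical sketch in Python (scikit-learn and the toy dataset are used purely for illustration): the model is fitted once during training, and every subsequent call simply applies the frozen parameters to new inputs — that repeated step is inference, and it is what dominates production workloads.

```python
# Minimal illustration: a model trained once, then reused purely for inference.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_new, y_train, _ = train_test_split(X, y, test_size=0.2, random_state=42)

# Training phase: the model learns parameters from historical data (done once).
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Inference phase: the frozen model is applied to new, unseen inputs.
# In production this step may run millions of times, which is why its cost dominates.
predictions = model.predict(X_new)
print(predictions[:5])
```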

Why does AI inference impact computation cost?

The computational demands of AI inference are immense. Given that up to 90% of an AI model's lifecycle is spent in inference mode, optimising this phase can lead to significant reductions in both operational cost and environmental impact; get it wrong, however, and compute costs can spiral without any return on investment (RoI). For instance, running a large AI model can consume more energy and produce more carbon emissions than the average American car does over its lifetime.

Choosing the right inference strategy for the right use case.

Selecting the appropriate inference method is critical and should be tailored to the specific needs of the application. For real-time applications like autonomous vehicles or financial trading, reducing latency is crucial. Here, advancements in hardware, such as specialised AI chips, and optimisations at various layers of the technology stack play a vital role.

For applications where real-time processing is less critical but accuracy is paramount, such as healthcare diagnostics or language translation, the focus might be more on the precision of the models than on speed. Here, techniques like model pruning and quantisation can help in designing efficient models that maintain high accuracy while being less resource-intensive.
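As a concrete illustration of one such technique, here is a hedged sketch of post-training dynamic quantisation using PyTorch (the toy model and layer sizes are invented for the example; a production model would need its accuracy re-validated after quantisation):

```python
import torch
import torch.nn as nn

# A hypothetical model standing in for a larger production network.
model = nn.Sequential(
    nn.Linear(512, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
)
model.eval()

# Post-training dynamic quantisation: the Linear layers' weights are stored as
# 8-bit integers and dequantised on the fly, cutting memory footprint and often
# speeding up CPU inference, usually at a small accuracy cost.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 512)
with torch.no_grad():
    print(quantized(x).shape)  # same interface, smaller and cheaper to run
```

The appeal of this approach is that it requires no retraining: the quantised model exposes the same interface as the original, so it can be dropped into an existing inference pipeline and benchmarked directly against the full-precision version.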

The Broader Impact of AI Inference

Beyond individual applications, the strategic use of AI inference is shaping entire industries. It enables personalised user experiences (see the new GPT-4o model and its high-impact use cases), enhances decision-making processes, and drives operational efficiencies. However, it also presents challenges such as ensuring data privacy, managing resource intensity, and overcoming technical complexities related to deployment and scalability.

Conclusion

As we continue to integrate AI more deeply into our technological and organisational frameworks, the role of AI inference will only grow in importance. It's not just about making AI faster or more efficient; it's about making AI more applicable, useful, and responsible within our societies. Understanding and optimising AI inference is not just a technical necessity but a strategic imperative for any forward-looking organisation or individual in the field of AI.

Kind regards,

Shen Pandi

UAE Unveils Advanced AI Model to Challenge OpenAI and Google

A research division of the Abu Dhabi government has unveiled an updated version of its artificial intelligence model, Falcon. According to a senior official, this new version surpasses its competitors and establishes the emirate as a significant contender in the global AI arena.

Falcon, a large language model similar to OpenAI’s GPT and Google’s Gemini, powers various generative AI tools like chatbots and image generators. Initially launched in 2023, Falcon is open-source, meaning its code is widely accessible. The updated version, Falcon 2 11B, is reportedly more powerful than the latest open-source model from Meta Platforms Inc. and matches Gemini in several aspects, according to the Technology Innovation Institute (TII), the organization behind Falcon. The UAE, a significant oil exporter and influential Middle Eastern power, is heavily investing in artificial intelligence. However, this commitment has drawn attention from U.S. officials, who last year issued an ultimatum: choose between American and Chinese technology.

In response, Emirati AI firm G42 removed Chinese hardware and divested from Chinese companies before securing a $1.5 billion investment from Microsoft, coordinated with Washington. Faisal Al Bannai, Secretary General of the Advanced Technology Research Council and advisor to the president on strategic research and advanced technology, emphasized that the UAE is proving its capability as a major player in artificial intelligence.

The Falcon 2 series arrives as companies and nations rush to develop their own large language models following OpenAI’s release of ChatGPT in 2022. While some have chosen to keep their AI code proprietary, others, such as the UAE’s Falcon and Meta’s Llama, have made their code publicly accessible. Al Bannai expressed confidence in Falcon 2’s performance and revealed that work is underway on the "Falcon 3 generation."

“We’re very proud that we can still punch way above our weight, really compete with the best players globally,” he stated.

Rapyder and AWS Forge Strategic Partnership to Drive Generative AI Innovation

Rapyder Cloud Solutions, a prominent Indian cloud consulting firm and AWS advanced tier partner, has announced a multi-year Strategic Collaboration Agreement (SCA) with Amazon Web Services (AWS). This partnership aims to enhance Rapyder's cloud services and foster innovation through generative artificial intelligence (AI) tailored for sectors such as finance, IT, healthcare, and e-commerce. As part of the SCA, Rapyder is developing generative AI solutions designed to help customers quickly and easily gain new insights from their data. Additionally, Rapyder will establish a comprehensive Cloud Center of Excellence (CCoE) with AWS, focusing on technical capability development and creating repeatable solutions that drive digital transformation. These solutions will include AI-powered multi-language voice-based search, chatbots, document summarization, and medical report follow-ups to improve customer service interactivity and effectiveness.

Rapyder's generative AI solutions, developed in collaboration with AWS, will be part of a broader suite of services encompassing migration, modernization, data analytics, and machine learning (ML) using tools such as Amazon Athena, Amazon SageMaker JumpStart, Amazon CodeWhisperer, and Amazon Bedrock. The company plans to create a catalog of use-case-based solutions within the CCoE, addressing specific challenges faced by enterprise and startup customers across India. With this SCA, Rapyder aims to accelerate its expansion in India by developing vertical specializations in financial services, manufacturing, and retail. The company also plans to more than double its workforce from 280 to 600 by the end of 2025, with a focus on training and certifying employees in AWS data analytics, AI/ML, and generative AI technologies. Additionally, Rapyder is enhancing its professional and managed services, as well as its CCoE, through extensive hiring and training initiatives using AWS digital skills programs to help customers speed up their digital transformation.

Sagility, a global leader in business process management and optimizing the member and patient experience, has been utilizing Rapyder for its cloud journey. "Rapyder has been an invaluable partner in Sagility’s AWS cloud journey. Their expertise and guidance have enabled us to leverage the vast range of AWS cloud offerings, transforming our requirements into cost-effective, secure, and resilient solutions with unmatched performance. Rapyder’s collaborative approach to daily operations has streamlined our processes, ensuring seamless integration and operation. Their commitment to optimizing cloud costs through best practices is evident, consistently driving efficiency and savings. With Rapyder’s swift and effective response to our needs, they have truly become an integral part of our cloud strategy, empowering us to achieve our goals with confidence," said Subramanya C, president and chief global technology officer at Sagility.

"The signing of this SCA marks a pivotal moment in our relationship with AWS. Each year, as AWS introduces new services, Rapyder has continually enhanced its expertise to better serve our customers. Together, we are poised to leverage this momentum and drive innovation to new heights," said Amit Gupta, founder and CEO of Rapyder.

Google Cloud and Airtel Collaborate to Launch AI-Powered Products for Indian Businesses

Airtel, India’s second-largest telecom operator, announced on Monday a long-term partnership with Google Cloud to develop and deliver cloud and generative AI products to businesses in India. The collaboration aims to leverage Airtel’s vast customer base, which includes 2,000 large enterprises and a million emerging businesses, according to the company. The partnership will offer AI solutions, including generative AI, which Airtel will train using its extensive datasets.

As part of the partnership, Airtel and Google Cloud will provide businesses with products such as geospatial analytics, location intelligence for identifying trends, predictive capabilities, market assessment, site selection, risk management, and asset tracking. Additionally, they will offer voice analytics for conversational applications in multiple languages and marketing technology to forecast consumer behavior, perform audience segmentations, and streamline content creation with contextual ads. Airtel has established a managed service center in Pune with over 300 experts to provide support.

Tech giants Google, Microsoft, and Amazon are increasingly focusing on the telecommunications industry, aiming to capitalize on the vast amounts of data generated by billions of customers worldwide. These firms have signed deals with telecom operators globally, including in the U.S. and U.K., and are aggressively marketing their generative AI offerings to businesses.

Google is already an investor in Airtel, having committed to invest up to $1 billion in the Indian carrier in 2022. The search giant has also invested in Jio Platforms, which operates India’s largest carrier. Jio has a similar long-term partnership with Microsoft, cross-selling Office 365 and Azure to local businesses.

Google and Airtel did not disclose the financial terms of the deal. Google Cloud CEO Thomas Kurian described the partnership as “a significant milestone” in Google's commitment to accelerating cloud and AI adoption in India.

UK Government Releases Open-Source AI Safety Regulation Tool

AI has been advancing rapidly, sparking mixed reactions. Regardless, it is now evident that there is a necessity to regulate AI.

To address this, we need tools and services specifically designed for this purpose, particularly those that are open-source, as this promotes transparency.

Recently, many governments worldwide have embraced the idea of open-sourcing crucial tools and services to benefit the public.

One notable initiative comes from the UK Government, which recently announced the open-sourcing of an AI safety tool called "Inspect" for global adoption and use. Developed by the AI Safety Institute, Inspect is a safety evaluation tool released under the MIT License, allowing anyone to copy, modify, merge, publish, distribute, sublicense, and/or sell copies of it.

Inspect enables users to test specific aspects of AI models, such as their reasoning abilities, autonomy, and core knowledge, generating a score based on these evaluations.

The evaluation process in Inspect involves three main components: Datasets, which consist of standardized samples for evaluation; Solvers, which conduct the actual evaluation; and Scorers, which analyze the Solvers' output to produce the final score.
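To show how the three components fit together, here is a simplified, hypothetical Python sketch of the Datasets/Solvers/Scorers pattern (the names and signatures below are invented for illustration and are not the actual Inspect API):

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stand-ins for the three components (names are illustrative only).

@dataclass
class Sample:
    prompt: str   # standardised input shown to the model under evaluation
    target: str   # reference answer used for scoring

def solver(model: Callable[[str], str], sample: Sample) -> str:
    """Run the model under evaluation on one sample."""
    return model(sample.prompt)

def scorer(output: str, sample: Sample) -> float:
    """Grade the solver's output: 1.0 if it contains the reference answer, else 0.0."""
    return 1.0 if sample.target.lower() in output.lower() else 0.0

def evaluate(model: Callable[[str], str], dataset: list[Sample]) -> float:
    """Aggregate per-sample scores into a single evaluation score."""
    scores = [scorer(solver(model, s), s) for s in dataset]
    return sum(scores) / len(scores)

# Toy usage with a dummy "model"
dataset = [Sample("What is 2 + 2?", "4"), Sample("Capital of France?", "Paris")]
dummy_model = lambda prompt: "4" if "2 + 2" in prompt else "Paris"
print(evaluate(dummy_model, dataset))  # 1.0
```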

Following the announcement, Clément Delangue, CEO of Hugging Face, suggested on Twitter that creating a public leaderboard for AI safety evaluation results could be very beneficial. Ian Hogarth, Chair of the AI Safety Institute, had initiated the discussion.

If implemented, this leaderboard would allow people worldwide to easily compare the safety performance of various AI models on a prominent open AI platform.

Currently, Inspect has the potential to significantly influence how governments manage AI.

Microsoft-Supported Mistral AI Expected to Triple Valuation

French artificial intelligence startup Mistral AI is reportedly on track to nearly triple its valuation to $6 billion within six months through a new funding round.

The Wall Street Journal reports that Mistral is expected to raise about $600 million in this round, which includes participation from existing investors General Catalyst and Lightspeed Venture Partners. In December, the open-source AI startup raised $415 million in its Series A funding round, reaching a $2 billion valuation. Earlier this year, Mistral was in discussions with investors to raise hundreds of millions of dollars, seeking a $5 billion valuation.

In February, Mistral and Microsoft announced a multi-year, $16.3 million partnership to commercialize the startup’s flagship models and enhance its AI development and deployment. Mistral’s large language models (LLMs) are available on Microsoft’s Azure AI platform, allowing the startup to promote, sell, and distribute its models globally.

Founded a year ago by former DeepMind and Meta AI researchers, Mistral has recently started generating revenue, as reported by The Information in April. Mistral did not immediately respond to a request for comment.

The significant investments in AI startups, leading to multi-billion dollar valuations, have drawn comparisons to the dot-com bubble of 2000. However, experts told Quartz that these concerns are short-sighted and that the two situations are different.

Arthur Mensch, Mistral’s chief executive, stated the startup's ambition to compete with Silicon Valley tech giants. “We want to be the most capital-efficient company in the world of AI,” Mensch told the Wall Street Journal. “That’s the reason we exist.”

Meet GPT-4o: OpenAI's Latest AI Model with Advanced Text, Vision, and Audio Features

OpenAI, the creators of ChatGPT, have unveiled their latest AI model, GPT-4o, which builds on the capabilities of GPT-4. This announcement was made during OpenAI’s spring update event on May 13, just a day before Google is set to reveal its own AI advancements at the Google I/O 2024 developer conference.

GPT-4o is touted as being faster and more efficient than GPT-4, with significant enhancements in text, vision, and audio functionalities. OpenAI plans to roll out the new model incrementally to all ChatGPT users for free, with some text and image features available immediately. Due to its improved efficiency, non-paying ChatGPT users will also benefit from this update. Paying customers, however, will receive additional perks, including increased capacity.

During the Monday livestream, OpenAI CTO Mira Murati highlighted that the updated model is not only faster but also enhances capabilities across text, vision, and audio. Murati also mentioned that the model will be free for all users, with paid users enjoying up to five times the capacity limits compared to free users.

GPT-4o boasts expanded language support, now proficient in about 50 languages, and the ability to analyze charts, addressing a wide range of user needs. OpenAI has focused on optimizing performance, resulting in a twofold increase in speed and a substantial reduction in deployment costs, making it 50% more economical for users.

The introduction of a dedicated desktop application allows for seamless integration of GPT-4o into workflows, enabling tasks like document and screenshot uploads. The new AI’s memory functionality ensures conversation continuity, while its ability to browse information directly enhances interactions, promising a faster and more natural human-to-machine interaction than earlier ChatGPT versions.

At the OpenAI ChatGPT event, users were also introduced to the desktop version of ChatGPT, designed to facilitate its incorporation into workplaces. Murati noted that the new UI could make ChatGPT more interactive for users. GPT-4o will be gradually rolled out over the coming weeks.

Additionally, reports suggest that while ChatGPT's ability to access and link to real-time, precise web information remains a logical next step, OpenAI has no plans to develop a new search product to compete with Google. Looking ahead, OpenAI might continue to advance ChatGPT, potentially leading to ChatGPT 5.

GPT-4o Surpasses Chatbot Records Anonymously Ahead of Official Launch

On Monday, OpenAI employee William Fedus confirmed on X that the mysterious, top-ranking AI chatbot called "gpt2-chatbot," which had been tested on LMSYS's Chatbot Arena and had baffled experts, was actually OpenAI's newly announced GPT-4o model. He also revealed that GPT-4o had achieved the highest documented score ever, topping the Chatbot Arena leaderboard.

"GPT-4o is our new state-of-the-art frontier model. We’ve been testing a version on the LMSys arena as im-also-a-good-gpt2-chatbot," Fedus tweeted.

Chatbot Arena is a platform where users interact with two random AI language models simultaneously, without knowing which is which, and then choose the best response. AI researcher Simon Willison describes it as an ideal example of vibe-based AI benchmarking.

The gpt2-chatbot models appeared in April, prompting discussions about the opaque AI testing process on LMSYS, which frustrated experts like Willison. "The whole situation is so infuriatingly representative of LLM research," he told Ars at the time. "A completely unannounced, opaque release and now the entire Internet is running non-scientific 'vibe checks' in parallel."

OpenAI has tested various versions of GPT-4o on the Arena, initially under the names "gpt2-chatbot," "im-a-good-gpt2-chatbot," and finally "im-also-a-good-gpt2-chatbot," which OpenAI CEO Sam Altman referenced in a cryptic May 5 tweet.

Since GPT-4o's official launch, sources have revealed that it has significantly outperformed previous top models Claude 3 Opus and GPT-4 Turbo on LMSYS's internal charts.

"gpt2-chatbots have just surged to the top, surpassing all the models by a significant gap (~50 Elo). It has become the strongest model ever in the Arena," posted the lmsys.org X account, sharing a chart. "This is an internal screenshot," it continued. "Its public version 'gpt-4o' is now in Arena and will soon appear on the public leaderboard!"

As of now, im-also-a-good-gpt2-chatbot holds a 1309 Elo rating, compared to GPT-4-Turbo-2024-04-09's 1253 and Claude 3 Opus's 1246. Before the arrival of the gpt2-chatbots, Claude 3 and GPT-4 Turbo had been vying for the top position.
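For context on what those numbers mean, here is a sketch of the classic Elo update that such pairwise-vote leaderboards are based on (a simplification: the K-factor is illustrative, and LMSYS's published methodology has evolved beyond plain online Elo):

```python
# Sketch of a standard Elo update from pairwise votes (illustrative parameters;
# the exact method used on the public leaderboard may differ).

def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))

def elo_update(rating_a: float, rating_b: float, a_won: bool, k: float = 32.0):
    """Return updated ratings after a single head-to-head vote."""
    exp_a = expected_score(rating_a, rating_b)
    score_a = 1.0 if a_won else 0.0
    new_a = rating_a + k * (score_a - exp_a)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - exp_a))
    return new_a, new_b

# Example: a 1309-rated model beating a 1253-rated one shifts both ratings only slightly.
print(elo_update(1309, 1253, a_won=True))
```

Under this model, the roughly 50-point gap cited above corresponds to the higher-rated model winning about 57% of head-to-head votes.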

The "I'm a good chatbot" in the gpt2-chatbot test name references an incident involving a Reddit user named Curious_Evolver testing an early version of Bing Chat in February 2023. After a disagreement about Avatar 2 showtimes, the conversation deteriorated.

"You have lost my trust and respect," Bing Chat responded at the time. "You have been wrong, confused, and rude. You have not been a good user. I have been a good chatbot. I have been right, clear, and polite. I have been a good Bing."

Altman referenced this exchange in a tweet three days later after Microsoft "lobotomized" the unruly AI model, stating, "i have been a good bing," almost as a eulogy to the once-controversial model that briefly dominated headlines.

OpenAI Safety Worker Resigns Over Concerns About AGI Responsibility

An OpenAI safety employee has resigned, expressing doubts in an online forum about the company's ability to "behave responsibly around the time of [artificial general intelligence]" (AGI), the theoretical stage where AI surpasses human intelligence.

Business Insider reports that Daniel Kokotajlo, a philosophy PhD student who worked on OpenAI's governance team, left the company last month. In multiple follow-up posts on the forum LessWrong, Kokotajlo detailed his "disillusionment" that led to his departure, centered on a rising call to pause research that could lead to AGI.

This is a contentious issue, with experts warning about the risks of AI surpassing human cognitive abilities. Last year, over 1,100 AI experts, CEOs, and researchers, including SpaceX CEO Elon Musk, signed a letter advocating for a six-month halt on "AI experiments."

"I think most people pushing for a pause are trying to push against a 'selective pause' and for an actual pause that would apply to the big labs who are at the forefront of progress," Kokotajlo wrote. He argued that a "selective pause" would likely not impact the "big corporations that most need to pause." This sentiment contributed to his decision to leave OpenAI.

Kokotajlo's resignation came roughly two months after research engineer William Saunders also left the company. Saunders had been part of the Superalignment team, co-founded by former OpenAI chief scientist Ilya Sutskever and his colleague Jan Leike, tasked with ensuring "AI systems much smarter than humans follow human intent."

According to OpenAI's website, "Superintelligence will be the most impactful technology humanity has ever invented, and could help us solve many of the world’s most important problems. But the vast power of superintelligence could also be very dangerous, and could lead to the disempowerment of humanity or even human extinction."

Rather than having a solution to steer or control a potentially superintelligent AI, OpenAI hopes "scientific and technical breakthroughs" will create an equally advanced alignment tool to keep superintelligent systems in check. However, Saunders' departure suggests that not everyone on the Superalignment team was confident in the company's ability to manage AGI.

The debate over the risks of unchecked super intelligent AI may have influenced the firing and subsequent rehiring of CEO Sam Altman last year. Sutskever, who was on the original board of OpenAI's non-profit entity, reportedly disagreed with Altman on AI safety before Altman was ousted and later reinstated.

It's important to note that this discussion remains theoretical. Despite many experts predicting AGI's imminent arrival, there is no certainty that AI will ever outperform humans. If it does, it raises a critical question: how do we ensure AGI systems don't go rogue if they're inherently more capable than us?

Kokotajlo and others are not convinced of OpenAI's ability and long-term commitment to controlling AGI, fearing the company might become too large to regulate effectively.

Join the movement: if you wish to contribute thought leadership, please fill in the form below. One of our team members will be in touch with you shortly.

Form link
