
The Holy Grail for Chief AI Officers (Part 2): Mastering KYI and Leveraging Inference Platforms

In my previous articles on "Know Your Inference" (KYI), we delved into why mastering KYI is essential for Chief AI Officers. KYI ensures GenAI models deliver accurate, transparent, and reliable results at optimal cost, qualities that are crucial for maintaining trust and compliance. Building on this foundation, it's important to explore the role of inference platforms in AI deployment. These platforms are the backbone of operationalising AI, providing the necessary infrastructure to ensure models are scalable, efficient, and effective in real-world applications. By integrating KYI principles with robust inference platforms, CAIOs can significantly enhance their AI strategies and drive better business outcomes.

Defining Inference Platforms

An inference platform is a system or framework designed to deploy and manage machine and deep learning models, enabling them to make predictions or inferences based on new data. These platforms streamline the process of taking a trained model from development to production, ensuring efficient, scalable, and reliable model deployment.
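To make this concrete, here is a minimal sketch in Python of the two core responsibilities such a platform abstracts away: registering versioned models and routing prediction requests to the current version. The class and method names here are purely illustrative and do not correspond to any particular product's API; real platforms add batching, hardware acceleration, and monitoring on top of this basic shape.

```python
# Minimal sketch of an inference platform's core loop: deploy versioned
# models, then route prediction requests to the newest version.
# All names are illustrative, not any specific product's API.

from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class ModelVersion:
    name: str
    version: int
    predict_fn: Callable[[List[float]], float]

class InferenceServer:
    """Registers model versions and routes requests to the latest one."""

    def __init__(self) -> None:
        self._registry: Dict[str, ModelVersion] = {}

    def deploy(self, model: ModelVersion) -> None:
        # Only promote a model if it is newer than the deployed version.
        current = self._registry.get(model.name)
        if current is None or model.version > current.version:
            self._registry[model.name] = model

    def predict(self, name: str, features: List[float]) -> float:
        return self._registry[name].predict_fn(features)

# Usage: deploy two versions; requests are routed to the newer one.
server = InferenceServer()
server.deploy(ModelVersion("scorer", 1, lambda x: sum(x)))
server.deploy(ModelVersion("scorer", 2, lambda x: sum(x) / len(x)))
print(server.predict("scorer", [2.0, 4.0]))  # version 2 answers: 3.0
```

The value of the abstraction is that clients call `predict` the same way regardless of which version is live, which is what makes safe model rollouts and rollbacks possible in production.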

Available Inference Platforms: Pros and Cons

1. Hugging Face Transformers

  • Pros: Extensive model library, strong community support, easy integration.

  • Cons: Can be resource-intensive, especially with large models.

2. ONNX Runtime

  • Pros: Cross-platform support, optimised for various hardware, high performance.

  • Cons: Limited support for some specific model types and custom operations.

3. NVIDIA Triton

  • Pros: High performance, supports multiple frameworks, efficient GPU utilisation.

  • Cons: Requires NVIDIA hardware, may be complex for beginners.

4. Ray Serve

  • Pros: Scalable, flexible, designed for production environments, supports complex workflows.

  • Cons: Can be challenging to configure and manage.

5. TensorFlow Serving

  • Pros: Part of a comprehensive ecosystem, high performance, widely adopted.

  • Cons: Best suited for TensorFlow models, may require significant setup.

Why Chief AI Officers should understand platform paradigms

Chief AI Officers (CAIOs) must grasp inference platform paradigms to ensure the successful deployment and scaling of AI models. Understanding these paradigms helps in selecting the right platform that aligns with the organisation’s infrastructure, performance needs, and strategic goals.

How to pick the right Inference Platform: A Detailed Guide

1. Evaluate Compatibility

  • Framework Support: Ensure the platform supports the AI frameworks you use. For instance, if you primarily use TensorFlow, TensorFlow Serving is an excellent choice due to its seamless integration.

  • Example: A company using PyTorch may prefer Hugging Face Transformers for its extensive library of pre-trained models compatible with PyTorch.

2. Consider Scalability

  • Load Handling: Select a platform that can scale to handle your anticipated workload. Platforms like Ray Serve are designed to scale horizontally, making them ideal for growing businesses.

  • Example: An e-commerce company anticipating increased traffic during holiday seasons might choose NVIDIA Triton for its ability to scale across multiple GPUs efficiently.
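Horizontal scaling of the kind Ray Serve and Triton provide can be pictured as a router spreading requests across identical model replicas: adding replicas raises total capacity without changing the client-facing API. The toy sketch below uses illustrative names only; real platforms handle routing, health checks, and autoscaling transparently.

```python
# Toy illustration of horizontal scaling: a router distributes incoming
# requests across identical model replicas in round-robin order.
# Names are illustrative; this is not any real framework's API.

from itertools import cycle
from typing import Callable, List

class ReplicaPool:
    def __init__(self, replicas: List[Callable[[str], str]]) -> None:
        self._replicas = replicas
        self._next = cycle(range(len(replicas)))

    def handle(self, request: str) -> str:
        idx = next(self._next)  # pick the next replica in rotation
        return self._replicas[idx](request)

# Usage: three replicas of the same "model"; requests rotate across them.
replicas = [lambda req, i=i: f"replica-{i}:{req}" for i in range(3)]
pool = ReplicaPool(replicas)
results = [pool.handle(f"q{n}") for n in range(4)]
print(results)  # ['replica-0:q0', 'replica-1:q1', 'replica-2:q2', 'replica-0:q3']
```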

3. Assess Performance

  • Latency and Throughput: Evaluate the platform's performance in terms of latency and throughput. ONNX Runtime is known for its optimised performance across different hardware, making it suitable for latency-sensitive applications.

  • Example: A financial services firm needing real-time fraud detection might select ONNX Runtime for its low-latency inference capabilities.
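When comparing platforms on latency and throughput, it is worth measuring against your own traffic rather than relying on vendor benchmarks. The small, framework-agnostic harness below records per-request latency percentiles and overall throughput; the `predict` argument is a stand-in for whatever real call you want to benchmark (an ONNX Runtime session, a Triton client request, and so on).

```python
# Minimal benchmarking harness for inference latency and throughput.
# `predict` is a stand-in for a real model call; swap in a call to your
# actual inference endpoint to benchmark it under representative load.

import time
from typing import Callable, List

def benchmark(predict: Callable[[int], object], n_requests: int) -> dict:
    latencies: List[float] = []
    start = time.perf_counter()
    for i in range(n_requests):
        t0 = time.perf_counter()
        predict(i)
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    latencies.sort()
    return {
        "p50_ms": 1000 * latencies[len(latencies) // 2],
        "p95_ms": 1000 * latencies[int(len(latencies) * 0.95)],
        "throughput_rps": n_requests / elapsed,
    }

# Usage with a dummy model that does a little arithmetic per request.
stats = benchmark(lambda i: sum(range(1000)), n_requests=200)
print(sorted(stats))  # ['p50_ms', 'p95_ms', 'throughput_rps']
```

Tail latency (p95, p99) usually matters more than the average for user-facing applications, which is why the harness reports percentiles rather than a mean.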

4. Analyse Ease of Use

  • Learning Curve: Consider the platform’s ease of use and the learning curve for your team. Hugging Face Transformers, with its user-friendly API and extensive documentation, is often easier for teams new to inference platforms.

  • Example: A startup with limited resources and expertise might opt for Hugging Face Transformers due to its straightforward integration and supportive community.

5. Review Community and Support

  • Community and Documentation: A strong community and comprehensive documentation can significantly ease the deployment process. TensorFlow Serving, being part of the larger TensorFlow ecosystem, benefits from extensive community support and documentation.

  • Example: An organisation looking for robust support and regular updates may choose TensorFlow Serving to leverage the active TensorFlow community.

Why Inference Platforms are the holy grail for Chief AI Officers

Inference platforms ensure that AI models are not just built but are deployed effectively, delivering real-world value. They offer the scalability, performance, and reliability needed to operationalise AI across various business functions. By mastering inference platforms, CAIOs can guarantee their AI solutions deliver consistent, transparent, and high-quality results, cementing AI's role as a strategic asset in their organisations.

Conclusion

Choosing the right inference platform is critical for the effective deployment of AI models. By evaluating compatibility, scalability, performance, ease of use, and community support, CAIOs can make informed decisions that align with their organisational goals and technical requirements. This strategic selection process ensures that AI applications are not only powerful and efficient but also transparent and trustworthy, cementing AI’s role as a cornerstone of modern business strategy.


HCL Technologies Introduces Enterprise AI Foundry to Accelerate GenAI Business Integration

HCL Technologies (HCLTech) announced on Monday the introduction of the Enterprise AI Foundry, designed to streamline and enhance AI implementation across various business sectors. According to a company statement, this comprehensive suite merges data engineering and AI with cognitive infrastructure, promoting the acceleration of transformation led by Generative AI (GenAI) throughout business value chains.

The Enterprise AI Foundry by HCLTech is optimized for major cloud platforms like Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP), and can also be scaled for on-premises setups. The platform aims to simplify the complexities associated with large-scale AI models, data isolation, and the excessive number of tools and frameworks, thereby enabling IT leaders to achieve better integration of IT and data resources.

This initiative not only allows business leaders to concentrate on tangible business outcomes but also supports development teams in effortlessly creating cutting-edge AI-driven applications.

Following the earlier launch of HCLTech AI Force, the Enterprise AI Foundry is set to fast-track AI-driven transformation in business processes and strategies. Srini Kompella, Senior Vice President of Data and AI at HCLTech, highlighted that the Enterprise AI Foundry will ease the core AI infrastructure challenges, facilitate the integration of enterprise data with AI technologies, simplify the development of AI applications, and ensure trust, safety, and reliability, thus promoting confident adoption among users.

Apple Expands Developer Tools with GenAI 'Apple Intelligence' and Enhanced Siri Features

At the WWDC 2024 keynote on Monday, Apple unveiled its latest generative AI technology, Apple Intelligence, which extends beyond consumer use to developer integration. Apple has enhanced its software development kits (SDKs) with new APIs and frameworks that enable developers to incorporate features such as GenAI image creation into their applications with minimal coding. For instance, the app Craft can now enhance document visuals with AI-generated images.

Moreover, AI-enhanced writing features will be readily available in apps using the standard text editor interface. An example provided was the Bear Notes app, which can now offer functions like rewriting, proofreading, and summarizing directly within the app.

Apple is also expanding the capabilities of Siri within apps. Existing users of SiriKit will not need to make any adjustments to benefit from upgrades in functionalities related to lists, notes, media, messaging, payments, restaurant bookings, VoIP calls, and workouts.

During the Developer keynote, Apple introduced two additional Siri enhancements. First, Siri can now execute commands from an app’s menu, such as retrieving presentation notes on command. Secondly, Siri can interact with any displayed text via Apple’s text systems, allowing for actions based on verbal commands, like initiating a FaceTime call through a reminder.

Apple also updated the App Intents framework, which supports simple interactions without full app installation, to integrate with Apple Intelligence. New intents are being rolled out in areas like photo editing, document management, and more, simplifying implementation for developers.

Furthermore, Apple plans to gradually enable Siri to use app intents across various domains through the Shortcuts app, enhancing Siri’s conversational and search capabilities through a new Spotlight API. This API allows Siri to index and interact with data like photos and messages from various apps.

Apple is set to enhance user communication and creative expression across its devices with the introduction of Writing Tools in iOS 18, iPadOS 18, and macOS Sequoia. These tools, powered by Apple Intelligence, enable users to write, rewrite, proofread, and summarize text within various applications such as Notes, Pages, Mail, and even third-party apps. The Rewrite feature offers users alternative versions of their text, adjusting tone to match the audience and context, while the Summarize feature can condense text into digestible formats like paragraphs, bullet points, tables, or lists.

In email management, Apple Intelligence simplifies the process with features like Priority Messages, which highlights urgent emails at the top of the inbox. It also provides email summaries directly in the inbox and Smart Reply options that suggest responses and highlight questions to ensure comprehensive replies.

Beyond text, Apple Intelligence ventures into visual creativity with the Image Playground feature, allowing users to quickly create images in styles such as animation, illustration, or sketch. This feature is integrated into various apps, including messages, and is complemented by a dedicated app where users can experiment with different themes, costumes, accessories, and locations by simply describing the desired image.

Furthermore, Siri has been upgraded to be more contextually aware and linguistically capable, enhancing its ability to handle natural language and maintain context over continuous interactions. A new visual design indicates when Siri is active, and despite its increased awareness, Apple assures that privacy and security are preserved. Apple Intelligence processes data on-device and utilizes Private Cloud Compute, enabling more complex processing on server-based models without compromising user privacy. This setup ensures that personal information is handled securely, aligning with Apple’s commitment to user privacy.

Pega Launches GenAI Blueprint 2.0, Empowering Enterprises with More AI Flexibility

Pega is enhancing its generative AI (GenAI) capabilities, introducing significant updates to its Blueprint platform and expanding GenAI enterprise options through collaborations with Amazon Web Services (AWS) and Google Cloud. The announcements were made at PegaWorld iNspire, the company's annual event, where Pega affirmed its leadership in enterprise AI decisioning and workflow automation. The enhanced Pega GenAI Blueprint – an app design-as-a-service tool critical for managing workflows – now boasts a more advanced user interface and personalised workflow options that adhere to industry best practices for nearly any situation.

Moreover, Pega users will now have access to large language models (LLMs) via new integrations with AWS and Google Cloud, providing a range of choices to best suit their needs based on factors like strategy, infrastructure, effectiveness, and cost. Don Schuerman, Pega’s CTO, highlighted the evolving nature of the generative AI market and the strategic importance of offering clients multiple options.

The upgraded Blueprint platform facilitates rapid transformation from app concepts to functional designs, streamlining legacy processes and enhancing personalization. New features include legacy transformation accelerators, live app previews, an improved user interface, data model generation, enhanced collaboration, superior idea generation, and partner-supplied templates.

Kerim Akgonul, Chief Product Officer at Pega, expressed enthusiasm about the transformative impact of the GenAI Blueprint, which has quickly shown its potential to revolutionise app design and organisational innovation. The enhanced capabilities are expected to launch in the second half of 2024, along with new GenAI services and models from AWS and Google Cloud.

AWS will introduce Amazon Bedrock, providing access to high-performance foundational models via a single API. Google Cloud will offer Vertex AI, Google Gemini, and Claude from Anthropic, emphasising security, privacy, and responsible AI practices. These models will be accessible through Pega Connect GenAI, a plug-and-play architecture that allows low-code developers to integrate GenAI into various workflows easily.

Rodrigo Rocha, Head of Global Apps ISV Partnerships at Google Cloud, also commented on the expansion of their partnership with Pega, noting the availability of over 150 models in Vertex AI to benefit Pega’s clientele.

Unlocking AI Potential with RHEL AI: A Dive into Open Source Artificial Intelligence

Have you ever been overwhelmed by endless reviews when trying to pick a new restaurant? Imagine having a personal assistant that sifts through feedback and selects the best spot based on your criteria. What if AI could help with that? Unfortunately, many AI models are locked behind commercial barriers, limiting their use and development by smaller organisations and independent developers. This is where RHEL AI steps in. Designed for both hybrid and cloud environments, RHEL AI is an open-source platform that democratises AI development. It leverages open-source principles, a robust architecture, and an active community, aiming to make AI more accessible, compatible, and reliable. With RHEL AI, individuals aren't confined to relying on big tech firms for AI implementations and can create diverse applications, from restaurant guides to spam filters.

Red Hat Enterprise Linux (RHEL) AI offers a comprehensive suite of tools and frameworks built on the reliable RHEL foundation to facilitate the development, deployment, and management of AI and machine learning applications. Embracing the open-source ethos, RHEL AI provides a secure and flexible infrastructure for AI practitioners. Introduced at the Red Hat Summit 2024 alongside enhancements to Red Hat OpenShift AI, it aims to mainstream AI with open-source methodologies. In collaboration with IBM Research, Red Hat is also advancing open-sourced AI language and code assistance models, promoting a more accessible and practical approach to AI.

A key component of this initiative is InstructLab, an open-source project that enables users to enhance AI models through a user-friendly interface. Unlike other methods that might necessitate forking a model, InstructLab allows for contributions to be integrated into future iterations, fostering a more democratic AI development process. This platform empowers domain experts and enthusiasts to develop AI applications, regardless of their expertise in data science. Jim Whitehurst, former CEO of Red Hat, champions the idea that InstructLab enables anyone to build AI models, irrespective of their professional background.

Exploring the Security Challenges of Open-Source Generative AI

Open-source software is rapidly becoming a dominant force in the tech industry, with the 2024 State of Open Source Report revealing that over two-thirds of companies have increased their usage of open-source software over the past year.

Generative AI is a prime example of this trend, seeing a significant rise in contributions from developers on platforms like GitHub. This surge in interest is reflected in the billions being invested by organisations into generative AI for a variety of applications, from customer service bots to automated coding tools. These projects often start with open-source foundations or are built from scratch as proprietary systems.

However, the proliferation of generative AI is not just attracting legitimate enterprises but also malicious entities. These range from nation-states spreading disinformation to cybercriminals crafting targeted phishing attacks or harmful software.

Presently, safety mechanisms are a key barrier preventing misuse of AI. Major closed-source models like ChatGPT and MidJourney enforce strict content guidelines to prevent the generation of harmful material. Despite these restrictions, there have been efforts to circumvent such safety measures, often referred to as "jailbreaking."

The growing adoption of open-source models, however, is likely to diminish the effectiveness of these safeguards. Open-source AI allows for more flexibility in data usage, which can accelerate improvements and foster transparency and competition. Yet, this same openness makes these models vulnerable to misuse. Notable examples include FraudGPT and WormGPT from the dark web, both of which are derived from the open-source GPT-J model by EleutherAI.

Moreover, the potential for AI misuse extends to open-source image synthesis models like Stable Diffusion, which are being adapted to produce abusive content. The development of AI-generated video content is also advancing, driven by the availability of robust open-source models and the requisite computing power.

The risks extend beyond external threats; building proprietary AI models increases an organisation's vulnerability internally. Issues during the training phase, such as the inclusion of sensitive or incorrect data, can lead to unintended outputs later on. Furthermore, prompt injection attacks present a continual risk, particularly in open-source environments where oversight may be lacking.

In open-source AI systems developed by entities like Stability AI, EleutherAI, or Hugging Face, or even in proprietary systems built in-house, there are no inherent safeguards against misuse. The openness of these models, while promoting innovation and democratisation, also exposes them to significant security risks.

In conclusion, while open-source AI models offer great potential for advancing technology, they also pose a growing threat that businesses must navigate without relying solely on regulatory bodies. AI tools themselves are becoming essential in combating these cybersecurity challenges. For more insights, consider exploring our guide on AI and cybersecurity.

Elon Musk Warns of iPhone Prohibition in Response to Apple's OpenAI Partnership

On Monday, Elon Musk announced that he would prohibit the use of Apple devices at his companies, including Tesla, SpaceX, and his social media firm X, if Apple were to incorporate OpenAI technology directly into its operating system. Musk labeled such an integration as a "security violation" and stated that Apple devices would need to be stored in a Faraday cage upon entering his companies.

This statement followed Apple's earlier announcement of new AI features across its applications and operating systems, as well as a partnership with OpenAI to integrate ChatGPT technology into its devices. Apple has claimed to prioritise privacy in its AI development, using a mix of on-device processing and cloud computing.

Musk criticised Apple's capability, expressing skepticism over Apple's partnership with OpenAI and questioning their ability to ensure user security and privacy. He tweeted, “It's patently absurd that Apple isn't smart enough to make their own AI, yet is somehow capable of ensuring that OpenAI will protect your security & privacy!"

Industry experts, such as Ben Bajarin, CEO of Creative Strategies, doubted that others would adopt Musk's stance, emphasising Apple's efforts to maintain data security and privacy, even when utilising cloud services.

Musk, who had previously co-founded OpenAI in 2015, sued the organisation and its CEO, Sam Altman, earlier in March. He accused them of deviating from their initial mission of developing AI for humanitarian purposes rather than profit. Additionally, Musk has established his own AI venture, xAI, aimed at creating an alternative to the popular ChatGPT chatbot. xAI was recently valued at $24 billion after a $6 billion Series B funding round.

Kling Challenges OpenAI Sora: New Chinese AI Model Delivers Enhanced Video Accuracy

In February, OpenAI introduced Sora, a video generation model capable of producing one-minute, high-definition videos. Even before Sora has become widely available, a new text-to-video model from Kuaishou Technology, known for its short-video platform, is already making waves. Named Kling, this new model reportedly harnesses technology akin to Sora and can generate 1080p high-definition videos up to two minutes long that realistically mimic physical world characteristics.

Kuaishou released a demo video this Thursday, demonstrating Kling's capabilities. The model, which has been developed in-house by Kuaishou's LLM team, is currently offered on an invite-only basis through the Kuaiying app, a video shooting and editing application from Kuaishou.

AI enthusiasts have taken to X to post videos created using Kling, which supports video generation up to two minutes at 30fps, surpassing Sora's current one-minute limit. User feedback on X suggests that Kling's outputs closely replicate real-world physics. It's worth noting that Kling is not the only text-to-video model emerging from China; Vidu AI, capable of producing 16-second 1080p videos, was launched in April. Kling is based on the Diffusion Transformer architecture, enabling it to transform text prompts into detailed visual content.

Additionally, Kling incorporates sophisticated 3D face and body reconstruction technologies through Kuaishou’s proprietary 3D VAE technology. This allows users to create videos in various aspect ratios, with variable resolution training facilitating full body movement and expression from just a single full-body image.

China's rapid progress in AI model development is evident with Kling, posing potential competition to OpenAI’s Sora in the Chinese market.

Sarah Friar Joins OpenAI as Chief Financial Officer in Strategic Leadership Update

OpenAI has recently announced the appointment of Sarah Friar as the new Chief Financial Officer and Kevin Weil as Chief Product Officer, marking a strategic move to bolster its global expansion and enhance research initiatives.

Sarah Friar, who previously led Nextdoor as CEO, will now manage OpenAI's financial operations and focus on sustaining investment in core research areas while expanding the company's operations to accommodate increasing demand for its AI products and services. Kevin Weil, coming from a role as President of Product and Business at Planet Labs, will direct the product team, aiming to leverage OpenAI's research for the benefit of consumers, developers, and various businesses.

OpenAI's CEO, Sam Altman, expressed confidence in the appointments, stating, "Sarah and Kevin bring a depth of experience that will enable OpenAI to scale our operations, set a strategy for the next phase of growth, and ensure that our teams have the resources they need to continue to thrive."

The appointments follow the recent departure of co-founder and chief scientist, Ilya Sutskever, and the appointment of Jakub Pachocki as OpenAI's new Chief Scientist earlier this year.

In her career, besides leading Nextdoor, Friar has held significant roles at Square, Goldman Sachs, McKinsey, and Salesforce. She is actively involved in several boards including Walmart and Consensys and is recognised as a Fellow of the Aspen Institute and Co-Chair of the Stanford Digital Economy Lab.

Weil's extensive experience includes co-founding the Libra cryptocurrency and leading product teams at major companies such as Facebook, Instagram, and Twitter. He is also a term member of the Council on Foreign Relations and contributes to the boards of The Nature Conservancy and the Black Product Managers Network.

Both Friar and Weil commented on their new roles. Friar emphasised her commitment to enhancing OpenAI's research and maximising the utility of AI tools, stating, "My goal is to help OpenAI continue excelling at what it does best—producing top-tier research and collaborating to maximise the benefits of AI tools for everyone." Weil focused on continuing the legacy of innovation at OpenAI, noting, "The product team at OpenAI has set the pace for both breakthrough innovation, and thoughtful deployment of AI products."

These strategic appointments are part of OpenAI’s effort to extend its influence globally, aiming to reach hundreds of millions of consumers, millions of developers, and the world's largest companies with its advanced AI research and products.

Towards AGI: 1st Flagship in-person Event in London, UK.

We are thrilled to announce "Towards AGI: 1st Flagship Event," an in-depth exploration of advancements towards artificial general intelligence. This inaugural in-person event, hosted by the Explore Group, will take place in London and presents a unique opportunity for AI enthusiasts and professionals to connect and discuss the implementation of Generative AI in their organizations. The focus of this event will be on GenAI in Banking and Insurance.

Scheduled for June 13, 2024, from 5 PM to 8 PM, the event will be held at Portsoken House, 155-157 Minories, London EC3N 1LJ. Don’t miss out on this chance to engage in insightful discussions and expand your network in the field of AI.

We are delighted to invite you to an evening that promises to be both stimulating and informative. The main event will feature a Fireside Chat titled "GenAI in Banking," lasting 40 minutes. During this session, experts will explore banking and asset management use cases, discussing how organizations are balancing governance concerns while exploring Generative AI. The chat will also include a Q&A session, allowing the audience to interact directly with the speakers and ask insightful questions.

Following the Fireside Chat, we will have Live Demonstrations lasting 30 minutes, showcasing real-world applications and use cases of Generative AI. This segment will include a demonstration of an insurance underwriter workflow with an embedded GenAI assistant and a walkthrough of an AI Chatbot designed for central metadata management in banks.

Join us for an evening of insightful discussions, live demonstrations, and networking opportunities as we delve into the world of Generative AI in banking and insurance. We look forward to seeing you there!

Click here to fill in the event form

Keep reading

In our quest to explore the dynamic and rapidly evolving field of Artificial Intelligence, this newsletter is your go-to source for the latest developments, breakthroughs, and discussions on Generative AI. Each edition brings you the most compelling news and insights from the forefront of Generative AI (GenAI), featuring cutting-edge research, transformative technologies, and the pioneering work of industry leaders.

Highlights from GenAI, OpenAI and ClosedAI: Dive into the latest projects and innovations from the leading organisations behind some of the most advanced AI models, both open-source and closed-source.

Stay Informed and Engaged: Whether you're a researcher, developer, entrepreneur, or enthusiast, "Towards AGI" aims to keep you informed and inspired. From technical deep-dives to ethical debates, our newsletter addresses the multifaceted aspects of AI development and its implications on society and industry.

Join us on this exciting journey as we navigate the complex landscape of artificial intelligence, moving steadily towards the realisation of AGI. Stay tuned for exclusive interviews, expert opinions, and much more!