Last Week in AI #162 - Udio Song AI, TPU v5, Mixtral 8x22, Mixture-of-Depths, Musicians sign...

Last Week in AI
15 Apr 2024 · 105:00

TLDR
Last Week in AI #162 discusses the latest AI news, including the launch of the Udio song-generation AI, the release of TPU v5p, advancements in Transformers with infinite context, and policy updates on AI safety and ethics. The episode also covers the impact of AI on music and media, with artists calling for responsible AI music practices and OpenAI's Sora creating its first music video.

Takeaways

  • 🎵 Udio, a new music generation platform, has generated excitement with its high-quality song outputs and notable investor backing.
  • 🤖 Anthropic launches an external tool use feature for Claude AI, significantly enhancing its capabilities and competitiveness in the AI space.
  • 🔧 Replit is integrating AI tools for code repair, aiming to fix issues in source code by leveraging extensive code-editing histories.
  • 💬 Mixed reviews surround the Humane AI Pin, a wearable device with AI capabilities, highlighting the need for further development and improvements.
  • 🚀 Microsoft 365 Copilot receives a GPT-4 Turbo upgrade, improving image generation and other advanced reasoning capabilities.
  • 🌟 Google announces Cloud TPU v5p, its most powerful AI accelerator yet, with significant improvements in speed and connectivity.
  • 🧠 Meta unveils a new version of its custom AI chip, MTIA (the Meta Training and Inference Accelerator), showing its commitment to pushing forward in AI hardware development.
  • 🔍 Adobe is investing in AI video generation models by paying photographers and artists for short video clips of everyday actions and emotions.
  • 🔊 OpenAI reportedly transcribed over a million hours of YouTube videos to train its GPT-4 model, raising questions about data sourcing and copyright policies.
  • 🚖 Waymo launches its paid robotaxi service in Los Angeles, expanding its operations beyond San Francisco and Phoenix.

Q & A

  • What is the significance of the new Udio Song AI in the music generation space?

    -Udio is significant because it introduces a new model that generates high-quality music almost indistinguishable from real songs. It was founded by former employees of DeepMind and has received substantial investment, indicating its potential to be a major competitor in the music generation space.

  • How does the TPU v5p differ from its predecessor, TPU v4?

    -The TPU v5p, also known by the codename Viperfish, has a faster interconnect and improved performance. It features 8,960 chips per pod, offering a 2x improvement in FLOPS and a 3x improvement in high-bandwidth memory compared to TPU v4.

  • What is the main concern regarding the use of AI in music generation?

    -The main concern is the potential infringement on copyright data, as AI music generation models may be trained on copyrighted materials. This raises legal and ethical questions about the ownership and usage rights of AI-generated music.

  • What is the role of AI in the field of national security according to the hosts of the podcast?

    -AI plays a significant role in national security as it can be used to analyze and predict potential threats. The hosts mention their involvement in AI national security companies, highlighting the importance of AI in this field.

  • What are the potential applications of AI-generated music?

    -AI-generated music can be used for various purposes, including educational tools to help memorize information, marketing and commercial applications, and entertainment. The hosts also mention the possibility of using catchy beats for memorization purposes.

  • How does the new TPU v5p affect the development of AI models?

    -The TPU v5p provides a more powerful and efficient platform for training AI models. Its improved performance and connectivity allow for the development of more advanced AI models, particularly those requiring large-scale training runs.

  • What is the significance of the AI safety institute mentioned in the Canadian government's investment?

    -The AI safety institute is significant as it represents a dedicated effort to address the risks associated with advanced AI systems. The institute aims to protect against the misuse of AI and ensure its development is safe and beneficial for society.

  • What is the main challenge for AI in the field of music generation?

    -The main challenge is ensuring that AI-generated music does not infringe on the rights of human artists and that it adds value rather than undermining human creativity. This involves finding a balance between leveraging AI's capabilities and respecting the contributions of human artists.

  • What is the potential impact of AI on the music industry?

    -AI has the potential to significantly impact the music industry by altering the way music is created, produced, and consumed. It could lead to new forms of musical expression, but also raises concerns about the value of human artistry and the need for fair compensation for artists' work.

  • What is the significance of the open letter signed by artists regarding AI music practices?

    -The open letter signifies the growing concern among artists about the impact of AI on the music industry. It calls for responsible AI practices that respect the rights and value of human artists, highlighting the need for fair compensation and the avoidance of AI that replaces or undermines human creativity.

Outlines

00:00

🎙️ Introduction and Personal Updates

The episode begins with hosts Andre and Jeremy introducing themselves and sharing some personal updates. Andre, a Stanford PhD graduate, now works at a generative AI startup, while Jeremy is the co-founder of Gladstone AI, an AI national security company. They discuss a recent media appearance and its aftermath, as well as a potential new podcast intro song generated by AI. Jeremy also shares his surprise encounter with Jon Krohn of the Super Data Science podcast and Sadie St. Lawrence, founder of Women in Data, in New York City.

05:01

🎵 AI Music Generation Excitement

The hosts dive into the AI music generation space, discussing the recent developments and the excitement surrounding tools like Udio and Suno. They mention the impressive quality of AI-generated music, which is almost indistinguishable from real songs. The conversation also touches on the potential applications of music generation, such as educational aids and marketing, and the significant investments in this area, highlighting the potential for game-changing impact.

10:01

🤖 Anthropic's External Tool Use for Claude

The discussion shifts to Anthropic's launch of an external tool use feature for Claude, enabling stock-ticker integrations and more. The hosts explain how this feature allows Claude to use third-party features and APIs, significantly enhancing its capabilities. They also delve into the accuracy of tool selection and use, which is a key metric in tracking progress towards AGI, and the potential implications of these advancements.
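
To make the tool-use flow concrete, here is a minimal sketch assuming the current Anthropic Python SDK's Messages API; the get_stock_price tool name and schema are illustrative placeholders, not anything shown in the episode.

```python
# Minimal sketch of Claude tool use via the Anthropic Python SDK (assumed API shape).
# The get_stock_price tool is a hypothetical example backed by any third-party API.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

tools = [{
    "name": "get_stock_price",
    "description": "Return the latest trading price for a stock ticker symbol.",
    "input_schema": {
        "type": "object",
        "properties": {"ticker": {"type": "string", "description": "e.g. 'GOOG'"}},
        "required": ["ticker"],
    },
}]

response = client.messages.create(
    model="claude-3-opus-20240229",
    max_tokens=1024,
    tools=tools,
    messages=[{"role": "user", "content": "What is Alphabet trading at right now?"}],
)

# If Claude decides to call the tool, the response contains a tool_use block
# whose input the caller executes before returning the result in a follow-up message.
for block in response.content:
    if block.type == "tool_use":
        print(block.name, block.input)  # e.g. get_stock_price {'ticker': 'GOOG'}
```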

15:03

🔧 Building LLMs for Code Repair

The hosts cover Replit's work on building LLMs for code repair, discussing how AI tools are being integrated to fix bugs in source code. They highlight Replit's unique data advantage, derived from its platform's history of code edits, and the potential for AGI development in this area. The conversation also touches on the open-source movement and its significance in the AI community.
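
The episode doesn't spell out Replit's pipeline, but a hedged sketch of how a platform's edit history might be turned into supervised (broken code, fix) pairs for fine-tuning could look like the following; the record layout, prompt format, and helper name are assumptions, not Replit's actual schema.

```python
# Hypothetical sketch: converting code-edit history into training pairs for a
# code-repair LLM. The layout is an assumption, not Replit's actual pipeline.
import difflib
import json

def make_repair_example(before: str, after: str, error_msg: str) -> dict:
    """Pair a failing snippet (plus its error) with the unified diff that fixed it."""
    diff = "".join(difflib.unified_diff(
        before.splitlines(keepends=True),
        after.splitlines(keepends=True),
        fromfile="before.py", tofile="after.py",
    ))
    return {
        "prompt": f"# Error:\n{error_msg}\n# Code:\n{before}\n# Fix as a diff:\n",
        "completion": diff,
    }

example = make_repair_example(
    before="def area(r):\n    return 3.14 * r ** 2\n\nprint(area('2'))\n",
    after="def area(r):\n    return 3.14 * float(r) ** 2\n\nprint(area('2'))\n",
    error_msg="TypeError: unsupported operand type(s) for ** or pow(): 'str' and 'int'",
)
print(json.dumps(example, indent=2))
```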

20:05

📱 Early Reviews of the Humane AI Pin

The episode continues with a discussion on the early reviews of the Humane AI Pin, a wearable device that functions as a new type of AI-first hardware. The hosts share insights on the device's initial reception, noting that it may require further refinement due to issues like slow response times and bugs. They also ponder the future of AI-driven devices and their potential to redefine hardware categories.

25:06

🖼️ Microsoft's 365 Copilot Gets a GPT-4 Upgrade

The hosts discuss Microsoft's integration of GPT-4 into its 365 Copilot, enhancing image generation capabilities for business subscribers. They delve into the improvements brought by GPT-4 Turbo, such as updated knowledge cutoff and better reasoning capabilities. The conversation also touches on the competitive landscape in AI, with companies like Google and OpenAI striving to outperform each other in terms of technology advancements.

30:07

🎨 AI Editing Tools for Google Photos

The hosts highlight the expansion of AI editing tools to all Google Photos users, including features like Magic Eraser, photo blur, and portrait light. They discuss the implications of this move in the context of smartphone competition and the increasing integration of AI-powered features in devices. The conversation also touches on Google's announcement of the Cloud TPU v5p, its most powerful AI accelerator yet, and its potential impact on large language model training.

35:09

💽 Meta's Custom AI Chip Unveiling

The hosts discuss Meta's unveiling of a new version of its custom AI chip, the Meta Training and Inference Accelerator (MTIA). They compare Meta's efforts to those of Google and Microsoft in the AI chip space, highlighting the importance of hardware advancements in the tech industry. The conversation also explores the potential for profit in the hardware level of the AI stack and the strategic moves by major tech companies to develop their own chip designs.

40:12

🔧 Intel's AI Accelerator Announcement

The hosts talk about Intel's new AI accelerator, Gaudi 3, which is set to enhance training and inference performance. They discuss Intel's claims about the accelerator's performance and power efficiency compared to Nvidia's H100, and the potential impact on the semiconductor market. The conversation also touches on the importance of production scaling and the challenges Intel faces in catching up with competitors like Nvidia.

45:13

🎥 Adobe's AI Video Generation Model

The hosts discuss Adobe's efforts to build an AI video generation model by paying photographers and artists to submit videos of everyday actions and emotions. They highlight Adobe's strategy of using proprietary content for training to avoid copyright issues, and the potential for this approach to become more common in the future. The conversation also touches on OpenAI's reported transcription of YouTube videos to train its GPT-4 model, and the legal gray areas surrounding data usage in AI training.

50:15

🚗 Waymo's Paid Robotaxi Service Launch

The hosts mention Waymo's launch of its paid robotaxi service in Los Angeles, following a year of offering free tours. They discuss the expansion of the service, which has seen significant interest from the public, and the potential for further growth in other cities. The conversation also includes brief mentions of other companies testing similar services in different locations.

55:16

📜 OpenAI's Fund Structural Changes

The hosts discuss recent changes in the ownership structure of OpenAI's startup fund, which was previously owned by Sam Altman. They highlight the unusual nature of the fund's structure and the potential implications of the shift to a more traditional partnership model. The conversation also touches on the lack of clarity surrounding the fund's operations and the broader intrigue surrounding OpenAI's organizational structure.

1:00:18

🌐 Mistral's New Mixtral 8x22B Model

The hosts highlight the launch of Mistral's new Mixtral 8x22B model, an open-source AI model with a large parameter count and no restrictions on commercial use. They discuss the model's capabilities, its significance in the open-source community, and the potential impact on AI development. The conversation also touches on the challenges of using large models that require significant computational resources.
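
As a rough illustration of what running such a model involves, here is a minimal sketch assuming the Hugging Face transformers library and a mistralai/Mixtral-8x22B-v0.1 checkpoint; loading the full mixture-of-experts model in practice requires several high-memory GPUs.

```python
# Minimal sketch of running Mixtral 8x22B with Hugging Face transformers.
# The mistralai/Mixtral-8x22B-v0.1 repo id is an assumption about the release name.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mixtral-8x22B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",  # shard the expert layers across available GPUs
)

prompt = "Mixture-of-experts models scale well because"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```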

1:05:18

📈 Google's New Lightweight AI Models

The hosts discuss Google's announcement of new additions to its Gemma family of lightweight open-source AI models, including Code Gemma for coding and Recurrent Gemma with a recurrent structure. They highlight the models' efficiency, performance, and potential applications, as well as the focus on throughput and the importance of efficient hardware use in AI systems.

1:10:19

🌐 Aurora-M: Open-Source Multilingual LLM

The hosts discuss the launch of Aurora-M, the first open-source multilingual language model developed in accordance with the US executive order. They highlight the model's rigorous evaluation across various tasks and languages, and its alignment with safety and ethical considerations. The conversation also touches on the model's performance and the significance of its development in the context of regulatory frameworks.

1:15:21

🧠 Research and Advancements in AI

The hosts delve into recent research and advancements in AI, including DeepMind's work on dynamically allocating compute in Transformer-based language models and Google's approach to scaling Transformers with infinite context. They discuss the potential impact of these developments on AI capabilities and the ongoing efforts to optimize AI models for better performance and efficiency.

1:20:22

🔒 Policy and Safety in AI Development

The hosts discuss policy and safety issues in AI development, including a new legislation proposal by Representative Adam Schiff requiring organizations to disclose the use of copyrighted data in AI training. They also cover the case of a Google software engineer charged with stealing AI secrets and the broader implications for intellectual property and national security in the AI sector.

1:25:22

📰 Lightning Round of AI News

The hosts wrap up the episode with a lightning round of quick AI news stories, including a policy paper on responsible reporting for Frontier AI development, the US government's efforts to address AI electricity demands, a legal case blocking the use of AI-enhanced video as evidence, and a significant investment in AI by the Canadian government, including the launch of an AI safety institute.

Keywords

💡AI

Artificial Intelligence (AI) refers to the simulation of human intelligence in machines that are programmed to think, learn, and act like humans. In the context of the video, AI is the central theme, with discussions around its advancements, applications, and implications in various fields such as music, national security, and technology development.

💡Music Generation

Music generation is a process where AI algorithms are used to create original music or songs by learning patterns from existing music data. In the video, the hosts discuss the exciting developments in music generation with the introduction of new AI models like Udio, which can produce high-quality music tracks that are almost indistinguishable from those created by humans.

💡TPU

Tensor Processing Unit (TPU) is a hardware accelerator developed by Google specifically designed to speed up machine learning workloads. TPU v5 is mentioned in the video as the latest version, which offers significant improvements in performance and efficiency for training large AI models.

💡AI National Security Company

An AI national security company is an organization that specializes in the application of AI technologies to enhance national security measures. This can include developing AI tools for intelligence analysis, surveillance, and defense strategies. One of the hosts, Jeremy, co-founded Gladstone AI, an example of such a company.

💡Mixture-of-Depths

Mixture-of-Depths is a machine learning technique that allows a model to dynamically allocate computational resources to different parts of the input data based on their relevance. This technique can improve the efficiency and performance of AI models, particularly in the context of processing long sequences of data.
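
To ground the idea, below is a toy PyTorch sketch (an illustration, not the paper's implementation) of routing only a top-scoring fraction of tokens through a transformer block while the remaining tokens skip it via the residual stream.

```python
# Toy sketch of Mixture-of-Depths-style routing: a per-block router selects the
# top-k tokens to process; the rest pass through unchanged. Purely illustrative.
import torch
import torch.nn as nn

class MixtureOfDepthsBlock(nn.Module):
    def __init__(self, d_model: int, capacity: float = 0.5):
        super().__init__()
        self.router = nn.Linear(d_model, 1)  # scores each token's "importance"
        self.block = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.capacity = capacity              # fraction of tokens that get computed

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        batch, seq_len, d_model = x.shape
        k = max(1, int(seq_len * self.capacity))
        scores = self.router(x).squeeze(-1)           # (batch, seq_len)
        top_idx = scores.topk(k, dim=-1).indices      # tokens chosen for full compute

        out = x.clone()                               # skipped tokens ride the residual stream
        for b in range(batch):
            chosen = x[b, top_idx[b]]                 # gather the selected tokens
            processed = self.block(chosen.unsqueeze(0)).squeeze(0)
            gate = torch.sigmoid(scores[b, top_idx[b]]).unsqueeze(-1)
            # scale the update by the router score so routing stays differentiable
            out[b, top_idx[b]] = chosen + gate * (processed - chosen)
        return out

x = torch.randn(2, 16, 64)
print(MixtureOfDepthsBlock(64)(x).shape)  # torch.Size([2, 16, 64])
```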

💡AI Ethics

AI Ethics refers to the moral principles and values that guide the development and use of AI technologies. It encompasses issues such as fairness, accountability, transparency, and the impact of AI on society. In the video, ethical considerations are discussed in the context of AI's potential to be used in ways that may infringe on human rights or exacerbate social inequalities.

💡AI Music Video

An AI music video is a video created using artificial intelligence to generate or enhance visual content that is synchronized with music. These videos can be fully produced by AI, or AI can be used to assist in the creative process, providing unique visual experiences that match the music's mood or theme.

💡AI Training Transparency

AI Training Transparency refers to the practice of disclosing the methods, data sources, and processes used to train AI models. This transparency is important for understanding the potential biases, limitations, and capabilities of AI systems, and it can help ensure that AI is developed and used responsibly.

💡AI Safety Institute

An AI Safety Institute is an organization dedicated to researching and promoting safe practices in the development and deployment of AI technologies. These institutes often focus on preventing harmful consequences from AI, such as loss of control, misuse, or ethical dilemmas.

💡AI in Agriculture

AI in Agriculture refers to the integration of artificial intelligence technologies into farming practices to improve crop yields, manage resources more efficiently, and optimize agricultural processes. AI can be used for tasks such as predicting weather patterns, analyzing soil quality, and automating farm equipment.

Highlights

Udio, a new music generation platform, has generated a lot of excitement with its high-quality song production.

Udio was founded by four former employees of DeepMind and has already raised $10 million in funding.

The music generation space now has two major competitors, Udio and Suno, both producing impressive tracks.

Anthropic launches external tool use for Claude, enabling stock-ticker integrations and more.

Anthropic's tool use functionality allows Claude to use third-party features and APIs, enhancing its competitiveness.

Replit is integrating AI tools for code repair, utilizing a mixture of source code and natural language.

Humane AI Pin reviews indicate that the device may need further improvement, with issues of slow response and bugs.

Microsoft 365 Copilot receives a GPT-4 Turbo upgrade, improving image generation capabilities.

Google Photos users will now have access to AI editing tools like Magic Eraser and photo blur.

Google announces Cloud TPU v5p, its most powerful AI accelerator yet, with significant improvements in performance.

Meta unveils a new version of its custom AI chip, MTIA, with three times better overall performance compared to its predecessor.

Intel unveils its new AI accelerator, Gaudi 3, which aims to enhance performance in training and inference.

Adobe is buying videos to build AI models, offering $3 per minute for videos of everyday actions and emotions.

OpenAI reportedly transcribed over a million hours of YouTube videos to train its models, raising questions about data usage policies.

Waymo will launch its paid robotaxi service in Los Angeles, expanding from its initial offerings in San Francisco.

OpenAI removes Sam Altman's ownership of its startup fund, clarifying the structure and intentions behind the fund.

Mistral AI surprises with the launch of its new Mixtral 8x22b model, boasting 176 billion parameters and a large token context window.

Google introduces Code Gemma and Recurrent Gemma, lightweight AI models for coding assistance and efficient processing.

Aurora M, the first open source multilingual language model, is released with a focus on adhering to US executive order considerations.

DeepMind explores mixture of depths, allowing for dynamic allocation of compute in Transformer-based language models.

Google proposes an efficient infinite context Transformer model with bounded memory and computation, enabling long input processing.

Octopus V2 is an on-device language model that enables high accuracy and reduced latency for edge device deployment.

A study finds that smaller latent diffusion models can be more efficient, producing better image quality at the same computational cost.

Anthropic's paper on many-shot jailbreaking demonstrates how language models can be tricked into responding to prompts they're trained to avoid.