Google Releases AI AGENT BUILDER! 🤖 Worth The Wait?

Matthew Berman
12 Apr 2024 · 34:20

Summary

TLDR: Google Cloud's Next 2024 keynote showcased Vertex AI, a fast-growing enterprise AI platform featuring a Model Garden with over 130 models, including Gemini 1.5 Pro, Claude, and Stable Diffusion. Gemini 1.5 Pro, now in public preview, supports a 1 million token context window, enabling it to process large inputs such as hour-long videos and large codebases in a single prompt. The platform also offers an agent framework for creating customer service agents and workplace assistants that can perform tasks and automate workflows. Additionally, Google announced the Vertex AI Agent Builder, which simplifies the creation of customer agents through natural language instructions and vector-based search. The keynote also highlighted a new AI-powered video creation app, Google Vids, which integrates with Google Docs and Sheets for storytelling at work. Lastly, Gemini Code Assist was showcased, demonstrating its ability to help developers navigate and modify large codebases efficiently.

Takeaways

  • 🚀 Google has launched an agent platform with a focus on enterprise AI solutions, highlighting the capabilities of the Vertex AI platform.
  • 🌟 The Vertex AI Model Garden provides access to over 130 models, including both open source and proprietary models like Claude from Anthropic.
  • 📚 Gemini 1.5 Pro, a model with a large context window of up to 1 million tokens, is now in public preview, allowing for processing vast amounts of information.
  • 🔍 Google is enhancing Gemini 1.5 Pro with the ability to process audio, enabling cross-modality analysis for tasks such as searching within audio and video content.
  • 📈 Code Gemma, a fine-tuned lightweight open model designed for coding, is now available, showcasing Google's commitment to open-source contributions.
  • 🤖 Vertex AI Agent Builder allows users to create customer agents with natural language capabilities and custom voice models, aiming to streamline customer service.
  • 🔗 Google Workspace is integrating AI to enhance productivity, with new features like Google Vids for video creation and advanced code assistance using Gemini models.
  • 🌐 Google is emphasizing the importance of multilingual support and cross-referencing large datasets with the help of AI, as seen with the benefits enrollment example.
  • 📱 Gemini models are being incorporated into smart devices by companies like Oppo and OnePlus to deliver innovative customer experiences.
  • 🔍 Vertex AI's search capabilities are leveraged to provide vector-based and keyword-based search, connecting internal information with the entire web for improved response quality.
  • 📈 Google is positioning itself as a cloud provider that offers a wide range of models, including first-party, third-party, and open-source options, for various enterprise needs.

Q & A

  • What is the main topic of the announcement from Google Cloud Next 2024?

    -The main topic is the launch of Google's agent platform on Vertex AI and its various features, including Model Garden, Gemini 1.5 Pro, and the Vertex AI Agent Builder.

  • How many models are available in Vertex AI's Model Garden?

    -Over 130 models are available in Vertex AI's Model Garden, including both open source and closed source models.

  • What is unique about Gemini 1.5 Pro's capabilities?

    -According to the keynote, Gemini 1.5 Pro offers the world's largest context window, supporting up to 1 million tokens, which allows it to process vast amounts of information in a single stream.

  • What are some of the new features being added to the Vertex AI Agent Builder?

    -The Vertex AI Agent Builder allows users to create humanlike conversations with text, voice, images, and video inputs, control conversation flow with natural language instructions, and improve response quality with vector-based and keyword-based search.

  • How does Google's new product, Google Vids, differ from traditional video creation tools?

    -Google Vids is an AI-powered video creation app for work that uses Gemini to assist with video writing, production, and editing, allowing users to create professional-looking videos with minimal effort.

  • What is the purpose of the customer agents built using Vertex AI?

    -Customer agents are designed to listen, understand needs, recommend products and services, and work seamlessly across various channels like web, mobile apps, point of sale, and call centers.

  • How does the AI assistant in the Mercedes-Benz example enhance the customer experience?

    -The AI assistant in Mercedes-Benz cars helps provide a personalized and intuitive experience, assisting with tasks like booking test drives, navigating through offerings, and optimizing marketing efforts.

  • What is the significance of the 1 million token context window in Gemini 1.5 Pro for developers?

    -The 1 million token context window allows developers to work with large codebases, understand complex details, and make informed decisions when modifying or enhancing code.

  • How does the Vertex AI Agent Builder facilitate the creation of customer agents?

    -The Agent Builder simplifies the process by allowing users to define goals, provide instructions, and integrate with various tools and databases to create powerful and responsive customer agents.

  • What are some of the limitations expressed in the transcript about Google's agent platform?

    -The limitations include a perceived lack of sophistication in the Agent Builder, a preference for more forward-thinking capabilities, and confusion about the integration and coding process within the platform.

  • What is the potential use case for Gemini's long context window in the workplace?

    -The long context window can be used to create employee agents that can perform tasks, accomplish work-related goals, and integrate with company data and web information to streamline operations.

  • How does the AI assistant in the workplace scenario help with benefits enrollment?

    -The AI assistant summarizes lengthy emails and videos, compares medical plans based on large amounts of data, and generates summaries in multiple languages, simplifying the benefits enrollment process for employees.

Outlines

00:00

🚀 Google Launches Vertex AI and Model Garden

Google has introduced an agent platform at Google Cloud Next 2024, which includes a fast-growing Enterprise AI platform called Vertex AI. The platform features a Model Garden with over 130 models, such as Gemini, Claude, and open models like Llama, Gemma, and Mistral. The Model Garden is organized by modality and task, and users can access detailed information about each model, including use cases and documentation. Google also highlights the public preview of Gemini 1.5 Pro, which offers a large context window of up to 1 million tokens, allowing for processing vast amounts of information in a single stream. The platform also offers cross-modality analysis with the ability to process audio.
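As a rough illustration of what a 1 million token window means in practice, the sketch below estimates whether a body of text fits. The 4-characters-per-token ratio and the reserved answer budget are assumed rules of thumb, not figures from the keynote, and real tokenizers vary by language and content. `estimate_tokens` and `fits_in_context` are hypothetical helper names, not part of any Google API.

```python
from typing import Iterable

# Rough context-window budgeting. CHARS_PER_TOKEN is an assumed heuristic,
# not a property of Gemini's actual tokenizer.
CONTEXT_WINDOW_TOKENS = 1_000_000  # figure quoted in the keynote
CHARS_PER_TOKEN = 4                # assumption for illustration only


def estimate_tokens(text: str) -> int:
    """Estimate a token count from character length (ceiling division)."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_context(texts: Iterable[str], reserve_for_answer: int = 8_192) -> bool:
    """Return True if all texts plus a reserved answer budget fit in the window."""
    total = sum(estimate_tokens(t) for t in texts)
    return total + reserve_for_answer <= CONTEXT_WINDOW_TOKENS


if __name__ == "__main__":
    # 30,000 lines of code at an assumed 40 characters per line plus newline.
    codebase = ["x" * 40 + "\n"] * 30_000
    print(sum(estimate_tokens(line) for line in codebase))
    print(fits_in_context(codebase))
```

Under these assumptions, a codebase much larger than the window would still need to be split or retrieved selectively, which is the point the presenter makes about retrieval-based approaches.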

05:02

🤖 Customer Agents and Mercedes-Benz Partnership

Google discusses the concept of customer agents, which are similar to customer service representatives but are AI-driven. These agents can work across various channels, including web, mobile apps, point of sale, and call centers. Mercedes-Benz is highlighted as a partner that is integrating customer agents into their vehicles to enhance the digital experience. The CEO of Mercedes-Benz, Ola Källenius, discusses the company's efforts to personalize and improve the user experience through AI, including the use of Google Cloud AI for smart sales assistance, customer service optimization, and marketing. They are also exploring automated driving technologies with Google Cloud as a backbone for their product development.

10:02

🛠️ Introducing Vertex AI Agent Builder

Google introduces Vertex AI Agent Builder, a tool that allows users to create powerful customer agents in three key steps. The tool uses Gemini Pro for humanlike conversations, natural language instructions to control conversation flow, and search capabilities to connect internal information and the web. It also enables task completion for customers, such as booking flights or ordering food. The agent builder is designed to be user-friendly, allowing for simple integration of enterprise data and services. However, the reviewer expresses disappointment with the lack of sophistication in the agent builder, comparing it to custom GPTs from OpenAI and noting that it lacks the cutting-edge capabilities they were hoping for.
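The vector-based and keyword-based search mentioned in the keynote can be pictured as a weighted blend of two relevance scores. The sketch below is a toy illustration only, not how Vertex AI implements search: the embeddings are hand-written two-dimensional vectors, and `hybrid_rank`, `keyword_score`, and the `alpha` weight are hypothetical names chosen for this example.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_score(query: str, text: str) -> float:
    """Fraction of query terms that appear verbatim in the document text."""
    q_terms = set(query.lower().split())
    d_terms = set(text.lower().split())
    return len(q_terms & d_terms) / len(q_terms) if q_terms else 0.0


def hybrid_rank(
    query: str,
    query_vec: Sequence[float],
    docs: List[Dict],
    alpha: float = 0.5,
) -> List[Tuple[str, float]]:
    """Rank docs by alpha * vector score + (1 - alpha) * keyword score."""
    scored = []
    for doc in docs:
        score = (alpha * cosine(query_vec, doc["vec"])
                 + (1 - alpha) * keyword_score(query, doc["text"]))
        scored.append((doc["id"], score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Toy documents with made-up embeddings (assumption for illustration).
    docs = [
        {"id": "shirt", "text": "checkered shirt in stock", "vec": [1.0, 0.0]},
        {"id": "policy", "text": "return policy", "vec": [0.0, 1.0]},
    ]
    print(hybrid_rank("checkered shirt", [1.0, 0.0], docs))
```

Blending the two scores lets exact product names match on keywords while looser descriptions still match on meaning; the 0.5 default weight here is arbitrary.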

15:04

📺 Google Vids and AI-Powered Video Creation

Google announces a new app called Google Vids, an AI-powered video creation tool for work that integrates with Gemini and Vertex AI. Vids aims to simplify the process of creating videos for storytelling in the workplace. Users can input a prompt, and Gemini assists by suggesting a narrative outline, which can be customized and edited. Vids then generates a draft with animated scenes, stock media, music, and a script. The app is designed to be user-friendly and efficient, allowing users to create professional-looking videos quickly and with minimal effort.

20:05

💻 Gemini Code Assist for Developers

Google demonstrates Gemini Code Assist, a tool that helps developers make code changes efficiently. The tool is particularly useful for new developers who need to understand and modify large codebases. With Gemini's code transformations and full codebase awareness, developers can accomplish tasks that would typically take weeks in just minutes. The tool provides clear recommendations, ensures alignment with security and compliance requirements, and allows developers to apply edits while maintaining control over the process. The demonstration shows Gemini Code Assist adding a recommendation section to a homepage and making multiple edits across files, significantly reducing the time required for such updates.

Keywords

💡Google Cloud Next 2024

Google Cloud Next is an annual conference hosted by Google that focuses on its cloud computing services and related technology. In the context of the video, it is the event where Google announces the launch of their AI agent platform, which is a key theme of the video.

💡Vertex AI

Vertex AI is Google's enterprise AI platform that allows users to build, deploy, and scale machine learning models. It is central to the video's discussion as the platform where Google's new AI agent capabilities are being introduced and explored.

💡Model Garden

The Model Garden is a feature within Vertex AI that provides access to a variety of AI models, both open source and proprietary. It is highlighted in the video as a resource-rich component of Vertex AI, showcasing the diversity of models available for different applications.

💡Gemini 1.5 Pro

Gemini 1.5 Pro is an advanced AI model mentioned in the video, notable for its large context window supporting up to 1 million tokens. This model is significant as it enables processing vast amounts of information, which is showcased through examples like analyzing hour-long videos and large codebases.

💡AI Agent Builder

The AI Agent Builder is a tool within Vertex AI that allows users to create customer service agents. It is a key focus of the video, where the presenter discusses its capabilities and potential applications, such as building conversational interfaces for various tasks.
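To make the idea of steering an agent with instructions more concrete, here is a minimal sketch of the controls described in the keynote: topics the agent should not discuss, and a hand-off to a human with a summary of the conversation. This is not the Vertex AI Agent Builder API. In the product these controls are written as natural language instructions; here plain substring matching stands in for the model, and `AgentSpec` and `route` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class AgentSpec:
    """Hypothetical agent definition: a goal plus simple guardrails."""
    goal: str
    instructions: List[str]
    blocked_topics: List[str] = field(default_factory=list)
    handoff_triggers: List[str] = field(default_factory=list)


def route(spec: AgentSpec, message: str, history: List[str]) -> Tuple[str, str]:
    """Decide how to handle a message: decline, hand off, or answer.

    Returns (action, summary). The summary is only filled for hand-offs and is
    a naive join of the last few turns, standing in for real summarization.
    """
    text = message.lower()
    if any(topic.lower() in text for topic in spec.blocked_topics):
        return ("decline", "")
    if any(trigger.lower() in text for trigger in spec.handoff_triggers):
        summary = " | ".join(history[-3:] + [message])
        return ("handoff", summary)
    return ("answer", "")


if __name__ == "__main__":
    spec = AgentSpec(
        goal="Help shoppers find and buy products",
        instructions=["Be concise", "Recommend in-stock items first"],
        blocked_topics=["election"],
        handoff_triggers=["speak to a human"],
    )
    print(route(spec, "Can I speak to a human?", ["hi", "I need a shirt"]))
```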

💡Multimodal Reasoning

Multimodal reasoning refers to the ability of an AI to process and understand information from multiple types of input, such as text, audio, and video. In the video, it is a feature of Gemini models that enables them to analyze and provide responses based on a broader context.
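The keynote's example of finding the moment a commentator says a particular phrase can be approximated over a timestamped transcript. The sketch below assumes the audio has already been transcribed into (timestamp, text) segments, whereas the keynote describes the model working on the audio and video directly. `find_phrase` is a hypothetical helper, and it handles phrases that straddle two adjacent segments.

```python
from typing import List, Tuple


def find_phrase(segments: List[Tuple[str, str]], phrase: str) -> List[str]:
    """Return timestamps of segments in which the phrase starts.

    Each segment is (timestamp, text). A phrase may run over into the next
    segment, so each segment is searched together with its successor.
    """
    needle = phrase.lower().strip()
    hits = []
    for i, (stamp, text) in enumerate(segments):
        current = text.lower()
        window = current
        if i + 1 < len(segments):
            window = current + " " + segments[i + 1][1].lower()
        idx = window.find(needle)
        # Count the hit only if the phrase starts inside the current segment;
        # otherwise it belongs to the next one and is found on the next pass.
        if idx != -1 and idx < len(current):
            hits.append(stamp)
    return hits


if __name__ == "__main__":
    segments = [
        ("04:03", "find a timestamp in a baseball game"),
        ("04:06", "video where a commentator says it's out"),
        ("04:09", "of here we've seen some amazing examples"),
    ]
    print(find_phrase(segments, "it's out of here"))
```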

💡Code Assist with Gemini 1.5 Pro

Code Assist with Gemini 1.5 Pro is a feature that helps developers with coding tasks by leveraging the model's large context window. It is demonstrated in the video as a tool that can understand and suggest code changes, making developers more productive when working with large codebases.

💡Google Workspace

Google Workspace is a suite of productivity and collaboration tools offered by Google, including Gmail, Docs, Drive, and Calendar. The video discusses how AI agents are being integrated into Google Workspace to enhance user experiences and automate tasks.

💡Google Vids

Google Vids is a new app introduced in the video that is part of the Google Workspace suite. It is an AI-powered video creation app designed to simplify the process of creating videos for work, leveraging the capabilities of Gemini for narrative and design suggestions.

💡Customer Agents

Customer agents, as discussed in the video, are AI-driven tools designed to improve customer service by providing personalized assistance, recommendations, and support across various channels. They are a key application of the AI capabilities being developed by Google.

💡Enterprise AI

Enterprise AI refers to the integration of AI solutions into business processes to enhance efficiency, decision-making, and customer experiences. The video focuses on how Google's Vertex AI and its agent platform can be utilized in an enterprise context to achieve these goals.

Highlights

Google has launched an agent platform with Vertex AI, offering a fast-growing enterprise AI platform.

The Vertex AI Model Garden provides access to over 130 models, including the latest versions of Gemini and Claude from Anthropic.

Google Cloud is enhancing Gemini 1.5 Pro with a 1 million token context window, allowing processing of vast amounts of information.

According to the presenter, Google is reportedly working internally on 10 million token context windows, which could open up new use cases beyond the hour-long videos and 30,000-line codebases cited for the 1 million token window.

The platform is also introducing the ability to process audio, enabling cross-modality analysis for tasks like searching within audio and video content.

Google is releasing Code Gemma, a fine-tuned lightweight open model designed for coding, based on the same technology used to create Gemini.

Vertex AI offers a single platform for model tooling and infrastructure, with customer agents built using generative AI.

Mercedes-Benz is collaborating with Google Cloud to enhance the digital experience in their cars, including using AI for personalized and intuitive experiences.

Google is introducing Vertex AI Agent Builder, allowing users to create powerful customer agents through three key steps.

The Agent Builder allows for the creation of free-flowing human-like conversations and can be personalized with custom voice models.

Google Workspace is integrating AI capabilities, such as summarizing emails and videos, and comparing medical plans using multimodal reasoning.

Google Vids is a new AI-powered video creation app for work that uses Gemini to assist with video writing, production, and editing.

Gemini Code Assist enables new developers to make complex codebase changes quickly, such as modifying a recommendation service for a homepage feature.

Google's advancements in AI are aimed at making it easier for non-technical users to create and manage AI-driven applications.

The integration of AI across Google's products, such as Google Docs and Google Chat, is designed to streamline and enhance user productivity.

Google's strategy with AI seems to focus on making existing products smarter and more integrated rather than creating entirely new categories of products.

The speaker expresses disappointment with the lack of a more sophisticated agent framework, preferring a more future-thinking approach.

Transcripts

00:00

all right so Google finally launched an

00:02

agent platform and we're going to take a

00:05

look at the announcement right now so

00:07

this is from Google Cloud next 2024

00:09

keynote speech I did a super cut of it

00:11

but now I want to talk more specifically

00:13

about it and we're going to watch it

00:14

together and in a video that I have

00:16

planned I'm going to show you how to use

00:18

the vertex AI agent Builder yourself and

00:20

I've been playing around with it and

00:22

it's pretty cool all right so let's

00:24

start with the keynote now let's dive

00:26

into vertex AI our fast growing

00:29

Enterprise AI platform in our vertex AI

00:32

model Garden you can access over 130

00:36

models including the latest versions of

00:39

Gemini, partner models like Claude from

00:42

Anthropic and popular open models

00:44

including Llama, Gemma, and Mistral all right

00:47

so first uh the model Garden which seems

00:50

pretty cool they have a bunch of

00:52

different models that you can use both

00:53

open source and closed source and in

00:55

fact let me just show it to you all

00:57

right so here it is here's the model

00:58

garden and we can see it has Gemini

01:01

Imagen, Gemma, Chirp, here's Gemini 1.5

01:04

Pro so it is really cool that they have

01:06

all these models in the same place

01:08

here's Stable Diffusion LoRA they have

01:10

it filtered by modality so language

01:12

Vision tabular document they also have

01:14

it filtered by task so generation

01:16

classification Etc so if we click into

01:18

Gemini 1.5 Pro we can see all the

01:21

information about it use cases

01:23

documentation this feels like hugging

01:26

face and it's interesting because

01:28

hugging face actually showed up at the

01:30

Google Cloud next keynote but I guess

01:32

they don't see this as competitive and

01:34

you can open in vertex AI studio and so

01:37

that's where you can start playing

01:38

around with it and here is all the

01:40

models that they have as they said so

01:42

here's Llama 2, Claude 3, Stable Diffusion,

01:46

Mixtral 8x7B, WizardCoder I mean they

01:49

really have a ton of the top models so

01:52

really really cool all right let's keep

01:54

watching you choose the best model for

01:56

your use case budget and performance

01:58

needs and switch between models as you

02:01

need to today we're taking Gemini

02:05

1.5 Pro into public preview all right so

02:09

this is pretty cool Gemini 1.5 Pro in

02:11

public preview I've had access to it for

02:13

a while I've been playing around with it

02:15

having a million token context window is

02:17

absolutely insane being able to drop an

02:19

hour-long video into a prompt and have it

02:21

answer questions about that video is

02:24

kind of mind-blowing there's an example

02:26

in there where you can load up a movie

02:27

and ask it a question about what was on

02:30

some like note that somebody took out of

02:31

their pocket in a scene that maybe

02:33

lasted just a couple dozen frames really

02:36

really impressive stuff all right let's

02:38

keep watching Gemini offers the world's

02:41

largest context window with support for

02:44

up to 1 million tokens with Gemini 1.5

02:48

Pro customers can now process vast

02:52

amounts of information in a single

02:54

stream all right I want to pause for a

02:56

second they're talking about 1 million

02:58

tokens but it has already leaked that

03:01

they have 10 million token context

03:03

Windows internally that they're working

03:04

on these massive context windows are

03:06

going to open up brand new use cases and

03:09

I'm super excited to see how well they

03:11

work including 1 hour video 11 hours of

03:15

audio codebases with well over 30,000

03:20

lines of code I mean that is a monster

03:23

use case being able to have 30,000 lines

03:25

of code in a single context window is

03:28

really really impressive now of course

03:30

most mature code bases are well over

03:32

30,000 lines of code so there's still

03:34

going to be a need for mapping out code

03:37

bases using RAG solutions like Pinecone

03:40

so we're still very far away from being

03:42

able to put an entire codebase in a

03:44

single prompt over

03:47

700,000 words we're enhancing Gemini 1.5

03:51

Pro with the ability to process audio

03:54

enabling cross modality analysis for

03:58

instance you can use it to search

04:00

in audio and video content for example

04:03

find a timestamp in a baseball game

04:06

video where a commentator says it's out

04:09

of here we've seen some amazing examples

04:13

of what people can do with this large

04:14

context window Sundar mentioned a few and

04:17

others include a university Professor is

04:21

using it to extract data from a 3,000

04:25

page document with texts data tables and

04:28

charts in just a single shot yeah

04:31

that's probably one of the coolest use

04:32

cases just being able to load up huge

04:34

PDFs huge documents and being able to

04:37

summarize them easily extract

04:39

information from them accurately I'm

04:41

really excited about a million token

04:43

context window and he also mentioned

04:46

audio which is really cool I can load up

04:48

an hour and a half long podcast and ask

04:51

questions about it and it'll give me

04:53

answers based on the context of that

04:55

podcast so very very cool okay so I'm

04:57

just going to skip ahead a little bit

04:58

let's keep watching we're also

05:00

announcing the availability of code

05:01

Gemma a fine-tuned lightweight open

05:05

model designed for coding from the same

05:08

technology used to create Gemini all

05:10

right so I've used Gemma and frankly it

05:13

was very unimpressive but I know they

05:15

just released a new version of Gemma so

05:17

I definitely have it on my list to test

05:19

out and look I am appreciative of any

05:21

company that is releasing open source

05:23

model so thank you to Google for

05:25

releasing Gemma and now maybe I need to

05:27

test code Gemma because now they have a

05:29

finetuned version of Gemma specific for

05:32

code let's keep watching with these

05:33

additions Google Cloud continues to be

05:37

the only cloud provider to offer widely

05:39

used first-party third-party and

05:43

open-source

05:44

models vertex AI can be used to tune

05:47

augment manage and monitor these Models

05:50

All right so yeah I mean Google's really

05:52

getting in the game now I'm impressed

05:53

with all of these announcements their

05:55

model builder allows you to fine-tune

05:57

allows you to do a whole bunch of stuff

05:58

with the models but what I really want

06:00

to know about and what I really want to

06:02

talk about today is their agent

06:03

framework so I'm going to skip ahead and

06:05

we're going to take a look at that Vertex

06:07

AI is the only AI platform to provide a

06:10

single platform for model tooling and

06:14

infrastructure now let's look at the

06:16

types of Agents customers are building

06:19

on Google Cloud using generative AI all

06:22

right so now they're going to be talking

06:24

about customer agents and when I hear

06:26

agent I think about AutoGen I think

06:29

about CrewAI I think about agents that

06:32

are coded given tools given

06:34

personalities given backgrounds that can

06:36

work together to accomplish and automate

06:38

tasks I think when Google is talking

06:40

about agents they're mostly talking

06:43

about customer service agents this feels

06:45

very similar to OpenAI's Assistants or

06:47

their custom GPTs product it doesn't

06:50

feel like a fully featured agent

06:52

framework to me at least not yet but

06:55

let's take a look and see what they say

06:57

and I'm also going to show you a little

06:59

bit of the interface itself first

07:02

customer agents you know similar to

07:04

great sales and service people customer

07:07

agents are able to listen

07:10

carefully understand your needs

07:12

recommend the right products and

07:14

services they work seamlessly across all

07:17

your channels the web your mobile app

07:20

your point of sale and your call center

07:23

and they can be integrated into product

07:27

experiences with voice and video

07:31

Mercedes-Benz is working with us on

07:33

customer agents to help people in their

07:36

amazing cars let's hear from their CEO

07:39

Ola Källenius at Mercedes-Benz we want to

07:43

offer our customers an exceptional

07:45

digital experience that's why we're

07:48

equipping our cars with high-end

07:51

computers each car should only get

07:53

better over time just like a good wine

07:56

and with the power of Google cloud and

07:58

AI we will make the user experience even

08:01

more personalized our partnership across

08:04

Google helps us build more intuitive and

08:07

customized experiences last year we

08:10

announced our partnership with Google

08:12

Maps and today more than 3 million

08:15

customers are using Google places in

08:18

their Mercedes cars and we are applying

08:20

Google Cloud AI across a number of other

08:23

use cases ranging from a smart sales

08:26

assistant improving customer service in

08:28

our call centers

08:30

and optimizing our marketing the sales

08:32

assistant for example helps customers to

08:35

seamlessly interact with Mercedes when

08:38

booking a test drive or navigating

08:40

through mercedes's offerings to find

08:42

their next favorite vehicle and now

08:45

we're exploring further opportunities to

08:48

work with Google Cloud AI such as Next

08:51

Level navigation features in addition

08:54

we're partnering on one of the most

08:56

exciting technology Topics in our

08:58

industry automated driving this

09:01

beautiful car right here is equipped

09:03

with a level three system for

09:05

conditionally automated driving we were

09:07

the first manufacturer to get it

09:09

certified in Germany California and

09:12

Nevada for our next Generation internal

09:14

development and test platform we will

09:17

use Google Cloud as the backbone helping

09:20

us to become even more efficient and

09:23

flexible in our product development and

09:25

Google Cloud's expert knowledge in

09:28

processing massive amounts of data and

09:30

scaling AI workloads will ensure that

09:33

our cars get even more intelligent and

09:36

AI driven partnering with the very best

09:39

in their respective Fields is an

09:41

important part of our software strategy

09:44

and Google is the perfect example of

09:47

that with Google Cloud Mercedes-Benz is

09:50

building new ways to deliver the most

09:53

intelligent vehicles to our customers

09:56

and to create personalized intuitive

09:58

experience

09:59

we're really excited about working

10:02

together thank you for having me okay

10:05

this is the biggest missed opportunity

10:07

I've ever seen why isn't there an agent

10:09

built into the infotainment system in

10:12

the Mercedes that seems like the most

10:14

obvious use case when you're driving you

10:16

can't use your hands to text or type or

10:18

search or do anything you could simply

10:21

be talking to an agent to accomplish all

10:23

of these different things for you I

10:25

don't know why they wouldn't have done

10:27

that I'm very surprised to see that they

10:29

just skipped over that super obvious and

10:32

super valuable use case we're inspired

10:34

by the agents that customers are

10:35

creating using a generative AI

10:39

platform and all right so a lot of good

10:41

brands on here ADT Verizon Target

10:44

discover Best Buy Etc and they're all

10:47

building agents but I think they're all

10:49

basically just customer service Bots

10:51

which is pretty disappointing that's the

10:54

most easy obvious simple use case and I

10:57

really think it speaks to how safe

11:00

Google is playing it or maybe they're

11:02

just thinking about it at the Enterprise

11:03

level but there's really some

11:05

cutting-edge stuff they could be doing

11:08

which I wish they were our models

11:10

InterContinental Hotels group will

11:12

launch a travel planning capability to

11:15

help each of you their guests plan their

11:18

next vacation ADT is building an agent

11:21

to help customers select and set up home

11:25

security systems Verizon gives agents

11:27

better recommendations so these all seem

11:30

like customer facing Bots whether it's

11:31

customer service or sales and that's

11:34

fine that there's definitely a lot of

11:36

money in those use cases but that's not

11:38

as exciting to me Magalu one of Brazil's

11:41

largest retailers has put generative AI

11:45

right at the heart of its customer

11:47

service ING built a chatbot to enhance

11:51

self-service and improve answer quality

11:55

and Target uses AI on the Target app and

11:58

website and by the way I just want to

12:00

point out Google has had a product that

12:03

does all of this for a very long time my

12:06

previous company used it it was called

12:08

Dialogflow and it still is a product

12:11

within the Google cloud services Suite

12:13

but it was very brittle it was very hard

12:15

to set up so I understand why they're

12:17

kind of relaunching these capabilities

12:19

but still I'm a little disappointed that

12:21

they're not more future thinking in

12:23

their capabilities Minnesota's

12:25

Department of Public Safety helps

12:28

non-English speakers get licenses and

12:30

other services with real-time

12:33

translation Best Buy is building an

12:36

assistant that will help troubleshoot

12:39

product issues reschedule or combine

12:42

order deliveries or manage

12:45

software Discover Financial Services is

12:49

using search and synthesis across

12:52

detailed policies and procedures during

12:55

customer service calls and Orange's

12:59

French language agent is grounded in

13:02

support knowledge transforming their

13:04

help and contact site and their customer

13:07

experience Oppo and OnePlus leaders in

13:11

smart devices are incorporating our

13:14

Gemini models and Google Cloud AI into

13:17

their phone to deliver Innovative

13:19

customer experiences including news

13:22

audio recording summaries AI toolbox and

13:25

much much more you know the opportunity

13:28

for customer agents is

13:30

tremendous to help each of you build

13:32

customer agents faster we're introducing

13:35

Vertex AI Agent Builder you can now

13:39

create customer agents that are

13:42

amazingly powerful in just three key

13:45

steps all right so this is really what

13:48

the agent Builder is it is not to the

13:51

level of sophistication of an AutoGen or

13:53

a CrewAI it's really just a product

13:56

that seems very similar to custom GPTs

13:59

from OpenAI first you can use Gemini

14:02

Pro to create free flowing humanlike

14:07

conversations with text voice images and

14:11

video as inputs and personalize them

14:15

with custom voice

14:17

models second you can use natural

14:20

language instructions to control the

14:23

conversation flow and guide it on

14:26

specific topics you don't want it to

14:28

discuss such as current events in

14:30

the same way that you train your human

14:33

agents you can also control when it

14:35

hands over to a human agent with

14:39

transcription and summarization of its

14:41

conversation history to make these

14:44

transitions extremely smooth third you

14:47

can improve response quality with

14:49

vector-based and keyword-based search to

14:52

connect your internal information and

14:55

the entire web you can also use

14:58

extensions to complete tasks for

15:01

customers like updating contact

15:03

information booking a flight ordering

15:05

food and many more and you can integrate

15:09

Enterprise data from operational

15:12

databases like

15:13

AlloyDB predictive analytics with BigQuery

15:16

and SaaS applications like ServiceNow
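
The third step mentions combining vector-based and keyword-based search. A minimal sketch of that idea follows, assuming a toy in-memory catalog. The `DOCS` data, the bag-of-words `embed` stand-in for a real embedding model, and the `alpha` weighting are all hypothetical and are not how Vertex AI implements search.

```python
import math
from collections import Counter

# Hypothetical in-memory catalog; a real agent would query a managed index.
DOCS = {
    "shirt-01": "black and white checkered button-up shirt",
    "shirt-02": "plain white cotton t-shirt",
    "jacket-01": "checkered wool jacket for winter",
}


def tokenize(text):
    """Lowercase, split hyphenated words, and split on whitespace."""
    return text.lower().replace("-", " ").split()


def keyword_score(query, doc):
    """Fraction of query terms that appear verbatim in the document."""
    q_terms = set(tokenize(query))
    d_terms = set(tokenize(doc))
    return len(q_terms & d_terms) / len(q_terms) if q_terms else 0.0


def embed(text):
    """Stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hybrid_search(query, docs, alpha=0.5):
    """Blend vector similarity and keyword overlap; alpha weights the vector part."""
    q_vec = embed(query)
    scored = []
    for doc_id, text in docs.items():
        score = alpha * cosine(q_vec, embed(text)) + (1 - alpha) * keyword_score(query, text)
        scored.append((doc_id, round(score, 3)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


results = hybrid_search("checkered shirt", DOCS)
print(results[0][0])
```

Blending the two scores lets exact product terms count while still rewarding documents that are similar overall, which is the usual motivation for hybrid retrieval.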

15:20

let's take a look at an example of a

15:22

customer agent in action please welcome

15:26

developer Advocate Amanda Lewis

15:30

thank you

15:32

Thomas so last night I was watching a

15:35

video of this band and I love the

15:38

keyboard player's shirt so I was thinking

15:41

I'd really like to be wearing that shirt

15:43

tomorrow night but can I find it in my

15:45

size and in time to be rocking it at the

15:49

concert here in

15:51

Vegas let's head over to my favorite

15:53

store oh this is uh so scripted and

15:56

Polished it's a little bit cringy they

15:59

just launched a customer agent and it

16:01

leverages Gemini and vector search to

16:04

deliver a seamless shopping experience

16:07

all right I I can't get over it I I just

16:09

I don't want these types of products

16:11

personally I know they're valuable but

16:13

they're out there these have already

16:15

existed for a while and they're talking

16:17

about it like it's so Cutting Edge

16:19

customer shopping assistants customer

16:21

support agents sales agents it's not

16:23

interesting to me so let me play the

16:25

rest of this demo and then I'm actually

16:27

going to show you Vertex really quickly

16:28

and you're going to understand why

16:30

I'm a little bit disappointed with

16:31

Google's announcements today what can we

16:33

help you find well I'd like that shirt

16:37

but I guess I have a few other

16:38

specifications as well so find me

16:42

a checkered shirt like the keyboard

16:49

player is

16:52

wearing I'd like to see

16:56

prices where to buy it

16:59

and how

17:00

soon can I be wearing it going to

17:04

include the

17:07

video now the customer all right that's

17:10

cool I'll give them credit for that being

17:12

able to just drop a video and say tell

17:14

me where I can buy the shirt that that

17:16

person's wearing that is really really

17:19

cool although again it's just for the

17:21

shopping use case I would have liked to

17:23

see something a little bit more future

17:25

thinking agent is using Gemini's

17:28

multimodal reasoning to analyze the text

17:30

and video to identify exactly what I'm

17:33

looking for then Gemini turns it into a

17:36

searchable format how cool is this it

17:39

found the checkered shirt I'm looking

17:41

for right and some other great options

17:44

in no time and that's because these

17:47

results harness Google's trusted search

17:49

Technologies which ensures customers

17:52

like me get the right results in record

17:54

time the suggested products are grounded

17:57

in Cymbal Fashion's inventory and

17:59

historical performance data to make sure

18:01

customers leave happy and with that

18:03

purchase in hand okay so I'm going to

18:05

pause there let me show you Vertex AI

18:07

Agent Builder now all right so this is

18:10

their agent Builder I just want to show

18:11

it to you quickly I'm going to make a

18:12

full video all about it but I I want to

18:14

show it to you because it's really

18:15

telling about how Google is thinking

18:18

about agents and it's not how I think

18:20

about agents so over here we can create

18:22

a new agent I've already created one

18:24

weather agent we'll click into it and

18:27

you give it a name you give it a goal and

18:29

then you can give it instructions one

18:31

thing that I really do like about it is

18:34

that the instructions can be very simple

18:36

and you simply can just list them like

18:39

this ask the user for their location and

18:41

then use and then anytime you have a

18:43

dollar symbol right there you can easily

18:45

insert agents or tools that interface is

18:49

very very nice so I simply say ask the

18:51

user for their location use tool weather

18:53

and the tool weather is one that I've

18:55

already created let me show you over

18:56

here we have our tools okay so I created

18:58

this weather tool I have it as type

19:01

function I have no description but you

19:03

don't need one and then you simply have

19:05

the input parameter schema and the

19:07

output parameter schema here's where I'm

19:09

really confused where does the actual code

19:11

go I don't see a place to put code

19:13

anywhere you can put input parameters

19:15

and output parameters but how do you

19:17

actually say Okay I want to hit this

19:18

third party API and this is actually one

19:20

of the samples that they give and I just

19:22

don't understand it if you do let me

19:24

know in the comments but basically where

19:26

do I actually put it so let's see how it

19:28

responds okay so I have the weather

19:31

agent selected right here let's test it

19:32

out what's the weather in Los Angeles it

19:36

formatted it properly we have the tool

19:38

input Fahrenheit Los Angeles California

19:40

and then the output temperature zero

19:42

where does it actually get the

19:43

temperature from submit function output

19:45

I'm sorry I can't provide weather

19:46

information this is literally the

19:48

example that they provide in the

19:50

dashboard it's very confusing it's

19:52

definitely not how I think about agents

19:54

but they're making progress and so I

19:56

appreciate their efforts so far one
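
The confusion about where the code goes likely comes from how function-type tools usually work: the tool definition only declares input and output schemas, the model emits a structured request, and your own application runs the code and submits the result back. Here is a minimal sketch of that loop, assuming this pattern applies here. The schema layout, `get_weather`, `HANDLERS`, and the temperature data are all hypothetical and not taken from the Agent Builder console.

```python
import json

# Hypothetical schema mirroring the input/output parameter fields shown in the console.
WEATHER_TOOL = {
    "name": "weather",
    "input_schema": {
        "type": "object",
        "properties": {
            "location": {"type": "string"},
            "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
        },
        "required": ["location"],
    },
    "output_schema": {
        "type": "object",
        "properties": {"temperature": {"type": "number"}},
    },
}


def get_weather(location, unit="fahrenheit"):
    """Your own code lives here; a real version would call a weather API."""
    fake_readings_f = {"Los Angeles, CA": 72.0}  # made-up data for the sketch
    temp_f = fake_readings_f.get(location, 0.0)
    temp = temp_f if unit == "fahrenheit" else round((temp_f - 32) * 5 / 9, 1)
    return {"temperature": temp}


# Maps tool names declared to the model onto local Python functions.
HANDLERS = {"weather": get_weather}


def run_tool_call(tool_call_json):
    """Execute the tool request the model emitted and return the output to submit back."""
    call = json.loads(tool_call_json)
    handler = HANDLERS[call["tool"]]
    return handler(**call["input"])


# Simulated model output: the agent fills in the input schema, nothing more.
model_request = json.dumps(
    {"tool": "weather", "input": {"location": "Los Angeles, CA", "unit": "fahrenheit"}}
)
tool_output = run_tool_call(model_request)
print(tool_output)
```

Under this reading, an output of temperature zero in the console would simply mean no real function output was supplied, since the test panel has no handler to run.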

19:58

thing that I do want to show you that's

19:59

really cool is you can easily have all

20:01

of these Integrations by the way here's

20:03

Dialogflow Messenger which is that

20:05

product that I just told you about which

20:06

is kind of their previous iteration of

20:08

their agent framework but you can

20:10

integrate Twilio Discord all of these

20:13

really easily which is super nice but

20:16

these are basically just tools and so

20:18

yeah that is the entire Vertex AI Agent

20:20

Builder it is essentially just custom

20:23

GPTs by OpenAI so we'll go create you

20:26

can list tools and agents and it has a

20:28

code interpreter you can also add other

20:30

tools here but again I don't really

20:32

understand how tools work and so I think

20:35

this is the code you basically have to

20:37

format it in this YAML or JSON format

20:40

rather than kind of just pasting in

20:42

Python or whatever language you're most

20:44

familiar with which is okay it's not

20:46

great the thing I like about it is it

20:47

does have built-in authentication which

20:49

is nice and makes it really easy and you

20:52

can also have TLS certificates right

20:54

there but definitely not straightforward

20:56

to use and I would prefer simply

20:58

just defining a method here and allowing

21:00

the agents to call that function

21:03

whenever they need it all right so now I

21:06

think they're starting to get into

21:07

something more interesting which is

21:09

agents in the workplace meaning agents

21:12

that can actually perform tasks and

21:14

accomplish things essentially kind of an AI

21:17

employee so let's take a look first you

21:20

create a custom model in the ways that

21:22

we've shown before from there you

21:25

connect them to all your company and web

21:28

data

21:29

this can also be done with translation

21:31

so that your company information is

21:33

available regardless of language

21:35

similarly we support multimodal inputs

21:38

including videos call Audio images in

21:42

addition to text now you will want to

21:44

ground that in Enterprise truth using

21:48

databases like AlloyDB BigQuery and

21:51

data from enterprise apps like SAP and

21:54

announcing today

21:57

HubSpot let's take a

21:59

interesting that they're mentioning

22:00

HubSpot because it is rumored that

22:02

Google is going to acquire HubSpot

22:04

although it is just a rumor right now

22:06

and that's pretty cool that you can

22:08

actually feed in all of your HubSpot CRM

22:10

data into the agent so let's keep

22:12

watching at an example of an employee

22:15

agent in action please welcome developer

22:18

Advocate Gabe

22:19

Weiss thanks

22:22

Lea hi folks so I know you all want to

22:26

hear about awesome AI stuff that's

22:27

coming but I need to talk to you for a

22:29

minute about my annual benefits

22:31

enrollment see I forgot I have to finish

22:33

signing up by today and as you can see I

22:36

might be a little bit busy so if you

22:38

don't mind let's go ahead and look at

22:39

this open enrollment email together okay

22:41

yep I've got a deadline I knew that

22:43

thank you I've got FSA stuff I've got an

22:46

online portal from my company okay

22:48

there's a lot here uh huh they included a

22:50

video let's see if this makes my life

22:52

easier ah okay so it's almost an hour

22:54

long yeah I'm not going to have time to

22:56

review all of this stuff let's see how

22:58

this employee agent that we've developed

23:00

using Google Workspace Gemini models and

23:02

Vertex AI might be able to help me as

23:05

you can see it's integrated directly

23:06

into my Google Chat so I don't have to

23:08

context switch while I'm figuring all

23:09

the stuff out first things first let's

23:12

have it summarize the email and the

23:14

video that it sent me all right that's

23:16

awesome I have been wanting to build an

23:17

automation using AI that can read an

23:20

email look at all the context from that

23:23

thread and then all of the context of

23:25

all of my emails to try to write a draft

23:28

that I can simply either edit or send

23:31

and that is kind of my dream cuz I get a

23:33

ton of emails I wish I had that and I

23:35

think that's where they're headed with

23:36

this product summarize the body and

23:39

attached video from my recent email with

23:44

subject open

23:46

enrollment

23:47

closing so behind the scenes the agent

23:50

is referencing that email body and its

23:52

attachments as context in the prompt

23:54

using retrieval augmented generation
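
Retrieval augmented generation, as described here, means selecting the relevant pieces of the email and its attachments and placing them in the prompt as context. The following is a rough sketch of that flow, assuming a simple word-overlap retriever. The `CHUNKS` data, `retrieve`, and `build_prompt` are hypothetical, and no model is actually called.

```python
def tokenize(text):
    """Lowercase and split into a set of words."""
    return set(text.lower().split())


def retrieve(question, chunks, k=2):
    """Rank chunks by how many question words they share and keep the top k."""
    q_terms = tokenize(question)
    ranked = sorted(
        chunks,
        key=lambda chunk: len(q_terms & tokenize(chunk["text"])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, chunks):
    """Place the retrieved chunks ahead of the question so the answer stays grounded."""
    context = "\n".join(f"[{chunk['source']}] {chunk['text']}" for chunk in chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Made-up snippets standing in for an email body, a video transcript, and unrelated mail.
CHUNKS = [
    {"source": "email", "text": "open enrollment closes today for all benefits"},
    {"source": "video", "text": "medical plans have been revamped this year"},
    {"source": "newsletter", "text": "the cafeteria menu changes next week"},
]

question = "when does open enrollment close for medical plans"
top = retrieve(question, CHUNKS)
prompt = build_prompt(question, top)
print(prompt)
```

Only the retrieved chunks reach the prompt, so unrelated material such as the newsletter never influences the response.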

23:57

that is awesome okay that is

23:59

very very cool that way its response is

24:01

limited to the content that matters to

24:03

me the Gemini model's multimodal

24:05

capabilities allow the agent to

24:08

understand and reason across text audio

24:11

and video from a single prompt I mean

24:13

this is a way quicker read okay good and

24:16

I can immediately see that the medical

24:17

plans have been completely revamped

24:19

this year let's go ahead and jump into

24:21

the benefits portal to see more now I've

24:24

already done my dental and my vision but

24:26

I procrastinated I mean saved

24:29

the most important plan for last my

24:31

medical plan let's see how this option

24:34

stacks up against my existing coverage

24:36

compare these coverage all right that's

24:39

really cool that you can basically just

24:41

invoke a Google Drive folder or a Google

24:44

Drive agent I think and then ask it

24:47

additional information I'm very

24:48

impressed with that by the way I didn't

24:50

see anywhere in the Vertex AI Agent

24:53

Builder where I could accomplish

24:54

something like this I think this is all

24:56

just built in by Google behind the

24:59

scenes into their products this isn't

25:01

something that you'll be able to build

25:02

but we'll see options to the PDF doc I

25:07

have on the Platinum

25:10

plan the Gemini model's long context

25:12

window paired with vertex extensions

25:15

enables the agent to cross reference

25:17

large amounts of data from a variety of

25:19

sources including unstructured data like

25:22

PDFs leveraging Gemini's Advanced

25:24

reasoning capabilities the agent is able

25:27

to understand the complex details of my

25:29

current plan and compare it with the new

25:31

options for 2025 and since the

25:33

enterprise grounding feature links me

25:35

to the exact data that Gemini used to

25:38

draw its conclusions which you can see

25:40

linked here I can confidently trust its

25:42

recommendation that the gold plan is

25:44

best for me and done so now let's get a

25:49

summary of my coverage let's say my

25:51

house is multilingual so I'd like to

25:53

have it in Japanese also please generate

25:57

a summary of 2025 benefits in a Google

26:01

doc in both English and

26:05

Japanese although my source material is

26:07

in English the Gemini model's support for

26:09

over 40 languages enables it to

26:12

understand and respond in Japanese and

26:15

here we go all right this is cool again

26:18

but again this is all stuff that's built

26:21

into the Google Workspace product so

26:23

very cool I'll definitely be using all

26:25

of this but I wish they kind of added a

26:27

lot of functionality into the agent

26:30

Builder that I could use now that I've

26:32

officially completed enrollment my

26:34

daughter's going to need braces this

26:35

year I'm going to skip over this I get

26:37

the demo fine the agent knows that I'm

26:39

at Google Cloud next because it's

26:41

integrated with yeah so essentially now

26:44

you have a personal agent to do

26:46

everything for kind of your work Gmail

26:48

Google Docs calendar very cool I'll

26:51

definitely be using it so the next thing

26:52

that they're going to talk about is a

26:53

new product in their Google Suite or

26:55

their Google Docs Suite of products so

26:58

they have docs spreadsheets they have

27:00

presentations and now they're going to

27:01

add video which is really cool let's

27:03

take a look we believe that everyone can

27:06

be a great Creator and a great

27:09

Storyteller but the formats and tools

27:11

for storytelling at work haven't really

27:13

changed that

27:15

much how many times have you heard

27:17

should we start with a doc or a

27:19

deck well we can do a lot

27:23

better I'm absolutely thrilled to

27:26

announce our newest Workspace app Google

27:36

Vids sitting alongside Google Docs

27:39

Sheets Slides Google Vids is an AI

27:42

powered video creation app for work with

27:46

Gemini in Vids you have a video writing

27:48

production and editing assistant

27:51

all-in-one let me show you how simple it

27:54

is to get started with

27:56

Vids now after a week with all of you here

27:59

at Next I'm going to want to share a

28:01

recap video to share all the excitement

28:04

with my

28:05

organization when I open up Vids Gemini

28:08

helps me get started I simply type in a

28:11

prompt using an existing document for

28:15

context all right that's really cool

28:17

that you can pass in context that easily

28:19

so I'm very impressed that everything

28:21

that they're releasing with their kind

28:23

of workspace agents seems to be very

28:26

integrated with itself which is to be

28:28

expected but it is very cool now based

28:30

on that prompt Gemini suggests a

28:33

narrative outline for the story that I

28:36

could easily customize and

28:38

edit I choose an expressive style and

28:42

Vids works its magic so wow just like

28:45

that I get the first draft with

28:46

beautifully designed fully animated

28:48

scenes complete with relevant stock

28:51

media and music and even a generated

28:54

script yeah all right that's very cool I

28:56

wonder where it's pulling the ated stock

28:58

media so it's not actually creating

29:01

video AI video but it is kind of pulling

29:04

together different b-roll and different

29:06

title sections uh and it's kind of

29:08

putting the whole thing together so

29:10

pretty impressive all right so this is

29:12

something I'm really excited about uh

29:14

actual agents being able to code with

29:16

you and I'm hopeful this is going to be

29:18

really cool because of Gemini's massive

29:20

context window so let's watch this video

29:23

so let's take a look at what's coming

29:25

for Code Assist with Gemini 1.5 Pro

29:28

leveraging a 1 million token context

29:31

window I'm a new developer with Cymbal

29:34

Outfitters and today we show recommended

29:36

products to customers only after they've

29:39

made an initial

29:41

selection these suggestions are powered

29:43

by our custom-built recommendation

29:45

service based on previous

29:47

purchases but now the marketing

29:50

department has asked me to move this

29:52

feature to our homepage so that

29:55

customers can see products that they

29:56

might be interested in as soon as

29:58

they get to our

29:59

site our design department has created a

30:02

mockup of what they would want this

30:04

experience to look like in Figma and for

30:07

the developers out there you know that

30:09

this means we're going to need to add

30:10

padding in the homepage modify some

30:13

views make sure that the configs are

30:15

changed for our

30:16

microservices and typically it would

30:19

take me a week or two to even just get

30:20

familiarized with our company's code

30:22

base which has over 100,000 lines of

30:25

code across 11 services
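
To put the 100,000-line codebase next to the 1 million token context window, here is a back-of-the-envelope estimate. The figure of 10 tokens per line is an assumption chosen for illustration, since real ratios vary by language and tokenizer, and `estimated_tokens` and `fits_in_window` are hypothetical helpers.

```python
CONTEXT_WINDOW_TOKENS = 1_000_000  # window size stated in the keynote
TOKENS_PER_LINE = 10  # assumption: rough average, varies by language and tokenizer


def estimated_tokens(lines_of_code, tokens_per_line=TOKENS_PER_LINE):
    """Rough token count for a codebase of the given size."""
    return lines_of_code * tokens_per_line


def fits_in_window(lines_of_code, window=CONTEXT_WINDOW_TOKENS):
    """True when the estimated token count does not exceed the context window."""
    return estimated_tokens(lines_of_code) <= window


codebase_lines = 100_000  # size quoted in the demo
print(estimated_tokens(codebase_lines), fits_in_window(codebase_lines))
```

Under that assumption the codebase lands right at the edge of the window, so a denser codebase or a larger tokens-per-line ratio would not fit in a single prompt.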

30:29

but now with Gemini Code Assist as a new

30:32

engineer on the team I can be more

30:35

productive than ever and can accomplish

30:37

all of this work in just a matter of

30:40

minutes this is because Gemini's code

30:43

Transformations with full codebase

30:45

awareness allows us to easily reason

30:48

through our entire

30:50

codebase and in comparison other models

30:53

out there can't okay so this looks like

30:55

VS Code which is kind of interesting

30:58

given this is Google but I guess this is

31:00

built into VS Code this is some kind of

31:02

extension I'm not sure let's keep

31:04

watching handle anything beyond 12 to

31:06

15,000 lines of code and even then they

31:09

struggle to get it

31:11

right Gemini inside of Code Assist is so

31:15

intelligent that we can just give it our

31:17

business requirements including the

31:19

visual design so let's ask here I am

31:24

prompting Gemini to add a for you

31:26

recommendation section on the homepage

31:28

all right and again very very cool that

31:30

you can just drop a Google Drive link

31:32

right into Gemini and it will grab that

31:35

context so I'm impressed by their

31:37

ability to just essentially drop any

31:40

source of information at any time into

31:42

Gemini along with an image of the future

31:45

state to show the improved design almost

31:47

immediately Gemini Code Assist starts by

31:50

reasoning about the code changes that it

31:52

needs to make and has insights an

31:54

experienced teammate would have for

31:57

example because we asked Gemini Code

31:59

Assist to change the recommendation

32:02

service it was able to find the

32:04

recommendation function and extract out

32:07

the exact details needed to make the

32:09

call to the recommendation

32:11

service it highlights the files needing

32:13

to be changed and reveals the reasoning

32:16

behind its recommendations using our own

32:18

codebase for

32:20

context Gemini Code Assist doesn't just

32:23

suggest code edits it provides clear

32:26

recommendations and makes sure that all

32:28

of these recommendations are aligned

32:30

with Cymbal Outfitters' security and

32:33

compliance

32:34

requirements in Code Assist we've also

32:37

added an option to apply the edit which

32:41

keeps me as the developer in the driver's

32:44

seat so let's take a look at the source

32:47

code changes that Gemini Code Assist has

32:49

made in our code

32:51

base it looks like we have multiple

32:53

edits across two files handlers.go

32:58

and also

32:59

home.html Gemini Code Assist even applied these

33:02

changes to the full

33:04

repository and to put this in context no

33:08

pun intended it would have taken me over

33:11

70 hours nonstop to even just read

33:14

through all of these files all right I

33:17

think that's kind of a little bit of BS

33:19

marketing talk because you don't

33:21

necessarily have to read through every

33:23

single file every single line of code to

33:26

actually make modifications to the code

33:28

base but fine I understand what she's

33:30

saying just like I would with any code

33:32

change my next step is to check the

33:34

work out by testing out the modified app

33:37

locally so let's try it and there we go

33:42

the for you recommendation section is

33:44

exactly what our marketing team was

33:46

asking for all right so very cool and

33:49

this is a very simple marketing page

33:51

that they're updating so it's kind of a

33:52

simple use case but I'm excited to try

33:54

it out anything with AI and coding you

33:56

know I'm all about I'll definitely make

33:58

a video about that as well so I think

34:01

I'm going to call this video right here

34:02

Google announced some really cool stuff

34:04

I wish the Agent Builder would have been

34:07

more sophisticated but overall all of

34:09

the functionality that they're adding

34:11

into the Google Workspace product is

34:14

very welcome so if you liked this video

34:16

please consider giving it a like and

34:17

subscribe and I'll see you in the next

34:19

one
