A Tableau Alternative in Python for Data Analysis (in Streamlit & Jupyter) | PyGWalker Tutorial

Coding Is Fun
18 Jun 2023 · 05:22

Summary

TLDR: PyGWalker is a Python library that turns Pandas dataframes into interactive, Tableau-like interfaces for data visualization and analysis. It installs via pip and can be integrated into Jupyter Notebooks or Streamlit apps. Users explore a dataset by dragging and dropping fields onto the axes, applying filters, and switching between chart types. Charts can be exported as images, and the dashboard specification can be exported as code or as a settings file, which makes the library useful for quick exploratory data analysis.

Takeaways

  • 📦 Install PyGWalker with 'pip install pygwalker' to integrate it with your Python environment.
  • 📈 PyGWalker allows conversion of Pandas dataframes into interactive Tableau-like UIs for data analysis.
  • 📊 The library can be used within Jupyter Notebooks or Streamlit apps for data visualization.
  • 🔍 The 'pyg.walk' function is used to explore dataframes with intuitive drag-and-drop features.
  • 🎨 Users can choose between dark and light modes for the UI.
  • 📑 The data tab provides direct inspection of the dataframe and its entries.
  • 📊 The visualization tab enables users to create charts by dragging fields onto x and y axes.
  • 🔧 Resizing and chart type selection can be done through the UI for better customization.
  • 🔑 Filters and multiple fields can be applied to both axes for in-depth data analysis.
  • 🖼️ Charts can be exported in scalable vector graphic (SVG) format for high-quality images.
  • 📋 The 'Export As' feature allows users to save their dashboard configurations as code or a JSON file for reuse.

Q & A

  • What is the name of the Python library introduced in the transcript?

    -The Python library introduced is called PyGWalker.

  • How can you install PyGWalker?

    -You can install PyGWalker by running the command 'pip install pygwalker' in your command prompt or terminal.

  • What is the primary function of PyGWalker?

    -PyGWalker allows users to turn a Pandas dataframe into a Tableau-style User Interface for visual data analysis.

  • Can PyGWalker be integrated into Jupyter Notebook or Streamlit apps?

    -Yes, PyGWalker can be integrated into both Jupyter Notebook and Streamlit apps.

  • What dataset is used for demonstration in the transcript?

    -The 'tips' dataset is used for demonstration purposes in the transcript.

  • How does PyGWalker categorize data in the visualization tab?

    -PyGWalker automatically labels numeric values as measures and the rest as dimensions in the visualization tab.

  • What types of chart options are available in PyGWalker?

    -PyGWalker allows users to choose from different chart types, including bar charts and line charts.

  • How can you apply filters in PyGWalker?

    -Filters can be applied in PyGWalker by dragging the column you want to filter into the filter field and selecting the desired criteria.

  • Is it possible to export charts created with PyGWalker?

    -Yes, charts created with PyGWalker can be exported, for example, to a scalable vector graphic (SVG).

  • How can you share or replicate the PyGWalker dashboard setup?

    -The PyGWalker dashboard setup can be shared or replicated by exporting the configuration as code or as a 'config.json' file.

  • How does the transcript demonstrate integrating PyGWalker with a Streamlit app?

    -After the Streamlit app is set up with basic page configuration and the 'tips' dataset is loaded, the contents of the 'config.json' file are read and passed to "pyg.walk", which restores the previous dashboard settings inside the app.

  • Where can viewers find the Streamlit app and Jupyter Notebook demonstrated in the transcript?

    -The Streamlit app and Jupyter Notebook will be uploaded to the presenter's GitHub repo, with the link provided in the description box of the video.

Outlines

00:00

🚀 Introduction to PyGWalker

This paragraph introduces PyGWalker, a new Python library that enables users to convert Pandas dataframes into interactive Tableau-style user interfaces for data visualization and analysis. It mentions the ease of installation via pip and the ability to integrate with Jupyter Notebook or Streamlit apps. The paragraph also provides a brief overview of the 'tips' dataset used for demonstration and explains how to use PyGWalker to explore the dataset with its intuitive interface.

Keywords

💡 PyGWalker

PyGWalker is a Python library that enables users to convert Pandas dataframes into interactive Tableau-style user interfaces for data visualization and analysis. It is designed to be integrated into Jupyter Notebooks or Streamlit apps, offering a more intuitive way to explore and analyze data without the need for extensive plotting. In the video, PyGWalker is showcased as a tool that simplifies the process of creating visualizations and conducting exploratory data analysis.

💡 Pandas

Pandas is an open-source data analysis and manipulation library for Python. It provides data structures such as dataframes, which are akin to tables or spreadsheets, and are widely used for handling and processing datasets. In the context of the video, Pandas dataframes serve as the input for PyGWalker, allowing users to visualize and interact with their data in a more user-friendly manner.

💡 Jupyter Notebook

Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations, and narrative text. It is widely used in data science and machine learning for prototyping, data cleaning, and visualization. In the video, the presenter demonstrates how to integrate PyGWalker into a Jupyter Notebook to create interactive data visualizations directly within the notebook environment.

💡 Streamlit

Streamlit is an open-source Python library used for creating custom web apps in a fast and easy manner. It is particularly useful for data scientists and engineers who want to turn their data scripts into shareable web applications without the need for extensive web development skills. In the video, Streamlit is shown as a platform where the PyGWalker dashboard can be embedded, allowing for interactive data exploration within a web app.

💡 Dataframe

A dataframe is a two-dimensional, table-like data structure in Python, provided by the Pandas library. It is a core component of data analysis in Python, allowing for the organization and manipulation of data. In the video, the 'tips' dataset is loaded as a dataframe, which is then visualized and analyzed using PyGWalker.

💡 Visualization

Visualization refers to the process of representing data graphically to make it easier to understand and analyze. In the context of the video, PyGWalker facilitates the creation of various types of visualizations, such as bar charts and line charts, by allowing users to drag and drop fields from the dataframe onto the canvas.

💡 Dark Mode

Dark Mode is a user interface mode that uses a dark color scheme, which can reduce eye strain and improve readability in low-light conditions. In the video, the presenter has the option to choose between a light and dark mode for the PyGWalker interface, showcasing the flexibility in user experience customization.

💡 Export

In the context of data visualization and analysis, exporting refers to the ability to save the created visualizations or the analysis environment in a format that can be shared or used outside of the original software. The video highlights the ability to export PyGWalker visualizations as scalable vector graphics and to save the dashboard configuration as a file.

💡 Filter

A filter in data analysis is a mechanism used to display a subset of the data based on specific criteria. In the video, filters are applied within PyGWalker to narrow down the data visualization to a particular group, such as showing data only for male customers.

💡 Exploratory Data Analysis (EDA)

Exploratory Data Analysis (EDA) is an approach to analyzing data sets to summarize their main characteristics, often with visual methods. The goal of EDA is to identify patterns, anomalies, relationships, and dependencies in the data. In the video, PyGWalker is presented as a tool that simplifies EDA by allowing users to interactively explore and understand their data through visualizations.

💡 Config File

A config file is a configuration file used to store settings for a software application. It typically contains parameters and preferences that dictate how the application should behave. In the video, a 'config.json' file is used to save the settings of a PyGWalker dashboard, allowing those settings to be reused or shared with others.

Highlights

Introduction of a new Python library called PyGWalker for data visualization.

PyGWalker allows conversion of Pandas dataframes into Tableau-style User Interfaces.

The library can be integrated into Jupyter Notebook or Streamlit apps.

Installation is straightforward via pip: 'pip install pygwalker'.

After installation, PyGWalker can be imported along with pandas.

The 'tips' dataset is used as an example to demonstrate the library's functionality.

PyGWalker automatically labels numeric values as measures and non-numeric as dimensions.

Users can drag and drop fields into x and y axes for data visualization.

The library supports multiple chart types, including bar charts and line charts.

Additional fields can be explored by dragging them into the y-axis for further analysis.

Filters can be applied by dragging specific columns into the filter field.

Charts can be exported in various formats, such as scalable vector graphics.

PyGWalker allows for the addition of multiple charts in a Jupyter Notebook.

The library's functionality can be integrated into a Streamlit app for interactive data analysis.

Config files can be used to maintain settings and dashboard specifications.

PyGWalker's code can be exported and reused for consistent dashboard generation.

The video includes a demonstration of creating a Streamlit app with PyGWalker integrated.

The Streamlit app allows for direct interaction with the data visualization within Streamlit.

The presenter plans to upload the Streamlit app and Jupyter Notebook to GitHub for viewers.

PyGWalker is highlighted as a valuable tool for quick exploratory data analysis.

Transcripts

00:00

Hey guys, there's a new Python library called PyGWalker,

00:03

which lets you turn your Pandas dataframe into a Tableau-style User Interface for visual

00:08

analysis.

00:09

You can integrate it into your Jupyter Notebook or Streamlit app.

00:13

Let me show you how to use it.

00:15

First things first, you need to install it.

00:17

Just run "pip install pygwalker" in your command prompt or terminal, and you're all set.

00:22

Once that's done, you can import it together with pandas.

00:26

Next, for demo purposes, let's load the 'tips' dataset, which looks like this.

00:31

Here, we can see the total bill amount, along with the tips, gender, whether the customer

00:36

was a smoker or not, the day and time of the visit, and the size.

00:40

Alright, now let's assume you want to explore this dataset.

00:44

Instead of plotting it in different ways, you can simply call "pyg.walk" and pass in

00:49

your dataframe.

00:50

You could also specify if you want to use a dark or light mode.
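
Editor's sketch: a minimal version of the notebook call described here. The `explore` helper is hypothetical and not part of PyGWalker. The `dark` keyword and its values ('light', 'dark', 'media') are assumed from the PyGWalker releases available around the time of the video; newer releases may name this option differently.

```python
# Sketch only: `explore` is a hypothetical helper, not a PyGWalker function.
VALID_MODES = {"light", "dark", "media"}  # "media" is assumed to follow the system theme


def explore(df, mode="light"):
    """Open the PyGWalker canvas for `df` in the requested colour mode."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}, got {mode!r}")
    # Imported lazily so the mode check works even without PyGWalker installed.
    import pygwalker as pyg

    # Assumption: `dark=` is the keyword used by mid-2023 releases.
    return pyg.walk(df, dark=mode)
```

In a notebook cell this might be called as `explore(pd.read_csv("tips.csv"), mode="dark")` after `pip install pygwalker`; the file path is an assumption, since the video does not state where the dataset is loaded from.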

00:54

After executing this line, you'll get this canvas with two tabs.

00:58

Let me first switch to the data tab.

01:00

Here, you can inspect your dataframe and check out some entries if you'd like.

01:04

You'll also notice that PyGWalker automatically labels the numeric values as measures and

01:10

the rest as dimensions.
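
Editor's sketch: the rule stated here (numeric columns become measures, everything else becomes a dimension) can be illustrated in plain Python. This is not PyGWalker's actual implementation, only an illustration of the rule as described; `classify_columns` and the sample rows are hypothetical.

```python
from numbers import Number


def classify_columns(columns):
    """Label a column 'measure' if all its values are numeric, else 'dimension'."""
    labels = {}
    for name, values in columns.items():
        # bool counts as a Number in Python, so exclude it explicitly.
        numeric = all(
            isinstance(v, Number) and not isinstance(v, bool) for v in values
        )
        labels[name] = "measure" if numeric else "dimension"
    return labels


# Illustrative rows shaped like the 'tips' dataset columns named in the video.
sample = {
    "total_bill": [16.99, 10.34],
    "tip": [1.01, 1.66],
    "sex": ["Female", "Male"],
    "smoker": ["No", "No"],
    "day": ["Sun", "Sun"],
    "time": ["Dinner", "Dinner"],
    "size": [2, 3],
}
print(classify_columns(sample))
```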

01:12

Now, let's switch back to the visualization tab.

01:16

Like in Tableau, you can now drag and drop the available fields into the x and y axes.

01:21

So, for example, let me plot the total bill amount by day.

01:26

To resize the graph, you can click here and choose the fixed layout mode.

01:32

Now you can resize the chart.

01:35

By default, PyGWalker picks a chart type for you.

01:38

However, you can use other chart types, like a line chart.

01:42

But for our analysis, I'll switch back to the bar chart.

01:46

The cool thing is, you can drag additional fields into your chart to explore your data

01:51

in a very intuitive way.

01:53

So, for example, I'll add the tips to the y-axis.

01:57

And instead of the total sum, I might want to look at the average bill amount.
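
Editor's sketch: switching the aggregation from sum to average changes what each bar represents. The plain-Python helper below shows the difference on made-up rows; `aggregate_by` is hypothetical, and in pandas the same idea would typically be written as `df.groupby("day")["total_bill"].mean()`.

```python
from collections import defaultdict
from statistics import mean


def aggregate_by(rows, key, value, how="sum"):
    """Group `rows` by `key` and aggregate `value` with 'sum' or 'mean'."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[value])
    func = {"sum": sum, "mean": mean}[how]
    return {group: func(values) for group, values in groups.items()}


# Made-up rows, not taken from the real dataset.
rows = [
    {"day": "Sat", "total_bill": 20.0},
    {"day": "Sat", "total_bill": 30.0},
    {"day": "Sun", "total_bill": 10.0},
]
print(aggregate_by(rows, "day", "total_bill", how="sum"))
print(aggregate_by(rows, "day", "total_bill", how="mean"))
```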

02:03

If you want to remove a field, you can simply drag it back to the field list.

02:08

Likewise, you can add multiple fields to the x-axis.

02:12

Let's say, I'm interested to see if smokers spend more money, and I want to see it for

02:17

each day.

02:18

Additionally, I also want the information if it was during dinner or lunchtime.

02:23

To do that, you can use the colour field.

02:26

And I'm only interested in this result for male customers.

02:30

So you can apply a filter by dragging the column into the filter field.

02:34

Now, I'll select only 'male' and confirm my selection.

02:38

The values will then be filtered accordingly.
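
Editor's sketch: conceptually, the filter keeps only the rows whose value in the chosen column is among the selected entries, and selecting every entry is the same as having no filter. The helper and rows below are hypothetical and only illustrate that behaviour.

```python
def apply_filter(rows, column, selected):
    """Keep only the rows whose `column` value is in the `selected` set."""
    selected = set(selected)
    return [row for row in rows if row[column] in selected]


# Made-up rows, not taken from the real dataset.
rows = [
    {"sex": "Male", "day": "Sat", "total_bill": 20.0},
    {"sex": "Female", "day": "Sat", "total_bill": 30.0},
    {"sex": "Male", "day": "Sun", "total_bill": 10.0},
]
males = apply_filter(rows, "sex", {"Male"})
everyone = apply_filter(rows, "sex", {"Male", "Female"})  # "select all" keeps every row
print(len(males), len(everyone))
```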

02:42

And to remove it, you can either select all again or simply drag it out of the filter

02:48

field.

02:49

I think it's a very interesting tool for quick exploratory data analysis.

02:53

However, you can also export the charts.

02:56

So, for example, I'll export it to a scalable vector graphic, and this is what it looks

03:02

like.

03:04

Alright, now back in my Jupyter Notebook, you can add additional charts by clicking

03:09

up here.

03:10

And now you have a second canvas, where you can again explore your data.

03:14

There are actually a couple more options in PyGWalker, but I would encourage you to play

03:19

around with it yourself.

03:21

But one important feature is to export your code.

03:24

So, if you click the "Export As" button, you'll have different options.

03:28

First, let me export it as "code".

03:31

Next, I'll hit the 'Copy to Clipboard' button.

03:34

Now, if I paste it into a new cell, you'll see the spec of my dashboard.

03:40

If I run "pyg.walk" and specify my specs, we'll get back the same dashboard.

03:45

Alternatively, you could also export your settings as a file.

03:49

If you do so, you'll get a "config.json".

03:53

Let me grab this config file from my downloads folder and paste it into my current directory.
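
Editor's sketch: one way to reuse the exported settings file in a notebook. Both helpers are hypothetical. Passing the file's JSON text through the `spec` keyword is assumed from the usage shown in the video, so check the installed PyGWalker version before relying on it.

```python
import json
from pathlib import Path


def load_spec(path="config.json"):
    """Read an exported spec file and check that it is valid JSON."""
    text = Path(path).read_text(encoding="utf-8")
    json.loads(text)  # raises ValueError if the file is not valid JSON
    return text


def walk_with_spec(df, path="config.json"):
    """Reopen the dashboard for `df` using the saved spec."""
    import pygwalker as pyg  # lazy import: only needed when rendering

    # Assumption: `spec=` accepts the JSON text, as in the video's usage.
    return pyg.walk(df, spec=load_spec(path))
```

Validating the JSON before handing it over means a damaged or half-downloaded file fails with a clear error instead of producing an empty canvas.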

03:59

Now, if you want, you can also implement the PyGWalker dashboard into your Streamlit app.

04:05

To demonstrate this, I've created the following app.

04:08

After importing Streamlit, I'll set some basic page configurations.

04:12

Next, I'll load our 'tips' dataset and set up a title and subheader.

04:18

The interesting part is here.

04:20

We can now read the content from the config file and use "pyg.walk" again in our Streamlit

04:26

app, together with the previous settings.

04:29

All you need to do is specify "Streamlit" in the environment.

04:32

This time, I'll also switch it to the dark mode.

04:35

Ok, and with that in place, let's spin up my app by running "streamlit run app.py".
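
Editor's sketch: the app described in the last few lines could look roughly like this. `run_app` is a hypothetical function, the page title and headings are placeholders, and the `env="Streamlit"`, `dark` and `spec` keywords are assumed from the call described in the video; newer PyGWalker releases may offer a different Streamlit integration.

```python
def run_app(st, pyg, df, spec_text, mode="dark"):
    """Lay out the page and embed the PyGWalker canvas.

    `st` and `pyg` are the imported streamlit and pygwalker modules; they are
    passed in so this function can be exercised without either installed.
    """
    # set_page_config has to be the first Streamlit call in the script.
    st.set_page_config(page_title="PyGWalker in Streamlit", layout="wide")
    st.title("PyGWalker in Streamlit")
    st.subheader("Tips dataset")
    # Assumption: these keywords follow the call shown in the video.
    return pyg.walk(df, env="Streamlit", dark=mode, spec=spec_text)
```

In `app.py` this would be wired up with `import streamlit as st`, `import pygwalker as pyg` and `import pandas as pd`, followed by `run_app(st, pyg, pd.read_csv("tips.csv"), open("config.json").read())`. The dataset path is an assumption.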

04:41

Once my page is loaded, we'll have the PyGWalker canvas in our Streamlit app.

04:46

And as before, you can now play around with your data directly in Streamlit.

04:51

Ok, guys, and that's all I have for you today.

04:54

I will upload this Streamlit app and also the Jupyter Notebook to my GitHub repo.

04:58

You will find the link in the description box of this video.

05:02

As always, thanks for watching, and I'll see you in the next video!

Related Tags
Data Analysis, Python Library, Interactive UI, Pandas DataFrame, Tableau Style, Jupyter Integration, Streamlit App, Data Visualization, Quick EDA, Export Options