Your Analysis Flow🔮
Let's walk through most of Bayeslab's important features by analyzing the data below.
Let's first import this data and see what it looks like:
The table has nearly 30k rows and a few columns covering demographic information, user classification, and purchase statistics.
Now click Next.
Bayeslab suggests a Type and Description for this table, along with some recommendations for making it better suited to analysis.
Type and Description matter a lot for the AI to perform actions correctly. Please make sure they are correct and contain enough detail, especially any particulars that are not common knowledge.
For example:
The lifecycle column contains A, B, and C levels. It would be good to specify what these levels mean and whether other values are possible.
The gender column is numeric, so it's good to specify what each number means.
We know this data is user behavior information from an online service, so we change the table name to user_behavior to better represent the nature of this data set.
Then click Confirm.
First, let the AI analyze this data and see what we get:
It mentions that the A, B, and C stages have no clear definition (ignore this for now), flags some missing values, and points out some potential relationships.
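To make the "missing values" finding concrete, here is a minimal sketch of how such a check could be done by hand. It is not how Bayeslab performs the analysis internally; the sample rows and column names (`age`, `gender`, `lifecycle`, `revenue`) are assumptions chosen only for illustration.

```python
# Hypothetical sample rows; the column names are assumptions, not the real schema.
rows = [
    {"age": 25, "gender": 1, "lifecycle": "A", "revenue": 120.0},
    {"age": None, "gender": 0, "lifecycle": "B", "revenue": 80.5},
    {"age": 41, "gender": None, "lifecycle": "C", "revenue": None},
    {"age": 33, "gender": 1, "lifecycle": None, "revenue": 15.0},
]


def missing_counts(rows):
    """Count None values per column across a list of dict rows."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            counts.setdefault(column, 0)
            if value is None:
                counts[column] += 1
    return counts


counts = missing_counts(rows)
for column, n in counts.items():
    # Report the absolute count and the share of rows affected.
    print(f"{column}: {n} missing ({n / len(rows):.0%})")
```

Columns with a high share of missing values are the ones worth documenting in the table Description or cleaning up before further analysis.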
Good to know.
There are also some interesting tools below:
Let's try automatic exploration and see what happens, by clicking Automatic Charting and then Run in the newly added block.
We can see some automatic analyses and distributions, each with observations about the chart for insights and some recommendations on how to improve it.
That's quite a lot of knowledge to take in.
Now we'd like to see what else we can get from this data, so click Analysis Ideas:
It returns some interesting ideas, with the data needs, visuals, and goals explained. We'll definitely try some later.
Let's get our hands dirty and wrestle with the data directly. Say we first clean and transform this data. The experience is pretty much just saying clearly what you want:
You can see the result contains a "Filtered and Normalized DataFrame", clicking it shows the data has been cleaned according to our prompt.
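For readers curious what a "filter and normalize" step amounts to, the sketch below shows one plausible interpretation: drop rows with a missing value, then min-max scale a numeric column to the range 0 to 1. The actual prompt and the code Bayeslab generates are not shown here, so the function name, the `revenue` column, and the choice of min-max scaling are all assumptions.

```python
def filter_and_normalize(rows, column):
    """Drop rows where `column` is missing, then min-max scale it to [0, 1].

    Hypothetical helper for illustration; returns new dicts and leaves
    the input rows unchanged.
    """
    kept = [dict(r) for r in rows if r.get(column) is not None]
    if not kept:
        return kept
    lo = min(r[column] for r in kept)
    hi = max(r[column] for r in kept)
    span = hi - lo
    for r in kept:
        # If every value is identical, fall back to 0.0 to avoid dividing by zero.
        r[column] = (r[column] - lo) / span if span else 0.0
    return kept


# Hypothetical sample data; the real table has different columns and values.
rows = [
    {"user_id": 1, "revenue": 100.0},
    {"user_id": 2, "revenue": None},
    {"user_id": 3, "revenue": 300.0},
    {"user_id": 4, "revenue": 200.0},
]
cleaned = filter_and_normalize(rows, "revenue")
for r in cleaned:
    print(r)
```

Returning copies rather than editing in place mirrors the idea that each block produces a new result that later blocks can reference.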
We can continue to massage this data in the next block by referencing this result (using // ) and saying what we want.
A result (like Final Result DataFrame) is temporary and will be cleared when you disconnect from the machine.
If you would like to save the result, you can write it back by saying "write back" or anything similar, like below:
Let's analyze further with a visualization, following the AI's recommendations from the previous steps.
Let's see how age group affects revenue.
You can see that the analysis actually infers a little more about the data using common knowledge, for example that 0-20 means "younger people".
The title and its position do not look good. We can change them by clicking "Edit" when hovering over the chart.
Although we might not understand machine learning deeply, we would still like to do some predictions and what-ifs.
With one line, we can build a prediction model to predict revenue from age, gender, and past order amounts.
And use another line to predict:
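The age-group comparison can be sketched as a simple bin-and-average computation. This is only an illustration of the idea, not the code Bayeslab runs: the bin edges, labels, and the `age` and `revenue` column names are assumptions.

```python
from collections import defaultdict
from statistics import mean

# Assumed bins: (label, lower bound inclusive, upper bound exclusive).
AGE_BINS = [("0-20", 0, 20), ("20-40", 20, 40), ("40-60", 40, 60), ("60+", 60, 200)]


def age_group(age):
    """Return the label of the bin that contains `age`."""
    for label, lo, hi in AGE_BINS:
        if lo <= age < hi:
            return label
    return "unknown"


def mean_revenue_by_age_group(rows):
    """Group rows by age bin and average the revenue in each group."""
    groups = defaultdict(list)
    for r in rows:
        groups[age_group(r["age"])].append(r["revenue"])
    return {label: mean(values) for label, values in groups.items()}


# Hypothetical sample rows for illustration only.
rows = [
    {"age": 18, "revenue": 40.0},
    {"age": 19, "revenue": 60.0},
    {"age": 35, "revenue": 120.0},
    {"age": 52, "revenue": 90.0},
    {"age": 67, "revenue": 30.0},
]
summary = mean_revenue_by_age_group(rows)
for label, value in summary.items():
    print(f"{label}: {value:.1f}")
```

A bar chart of these per-group averages is the kind of view the walkthrough describes.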
Please pay special attention to R-squared value in the output.
R² measures how well the independent variables explain the variance in the dependent variable. It typically ranges from 0 to 1:
0: The model explains none of the variance.
1: The model explains all of the variance.
So in the sample above, the model is actually very poor (R-squared = 0.069): it explains only about 7% of the variance in revenue. This means the factors we chose don't have a clear relationship with revenue, or we need to try other models.
Finally, someone on our team would like to see this analysis and maybe ask a bunch of their own questions.
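To see where an R-squared value comes from, here is a minimal sketch that fits a one-variable least-squares line and computes R² as 1 minus the ratio of residual to total sum of squares. It is not the model Bayeslab builds (which uses several inputs); the helper names and the sample numbers are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(ys, predictions):
    """R² = 1 - SS_res / SS_tot."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical data where age barely explains revenue.
ages = [20, 30, 40, 50, 60]
revenue = [55.0, 40.0, 70.0, 45.0, 60.0]

slope, intercept = fit_line(ages, revenue)
predicted = [slope * a + intercept for a in ages]
r2 = r_squared(revenue, predicted)
print(f"R-squared: {r2:.3f}")
```

With data this noisy the fitted line is nearly flat and R² stays close to 0, which is the same situation as the low score in the walkthrough.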
Make sure you check the "Include data table" option so your friend would be able to reproduce everything and do more with the data.
For more details, please see .