Post by account_disabled on Mar 12, 2024 4:44:02 GMT
When you run the program, the data is loaded into a data frame and displayed as a table, as shown above.

4. Prepare data for clustering

We will use numpy to prepare the loaded data for clustering (grouping multiple pieces of data into a single group).

Mr. Nakajima: "Numpy is a numerical computing library and an excellent tool that can handle everything from matrix operations to advanced calculations."

When I ran it, the output was a multidimensional array in which each element was itself an array.
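The conversion step might look like the minimal sketch below. The original Excel data is not shown in this post, so the column names and values are made up for illustration; only the name df follows the talk.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the data frame loaded from Excel.
# The column names and values are illustrative assumptions.
df = pd.DataFrame({
    "visits": [1, 2, 10, 11, 30, 31, 50, 52],
    "spend": [100, 120, 400, 420, 900, 950, 2000, 2100],
})

# Convert the numeric columns to a 2-D numpy array:
# one row per record, one column per feature.
features = np.array(df[["visits", "spend"]])

print(features.shape)
print(features[:2])
```

Printing the array shows the nested structure described above: an outer array whose elements are the per-record arrays.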
5. Cluster the output data using K-means

Mr. Nakajima: "Don't overthink it. Let's set the number of clusters to 4 and predict the cluster numbers."

When executed, it outputs the cluster numbers, each one of four values.

6. Attach to data frame

Finally, we will attach the output data to the data frame.

Mr. Nakajima: "This is the only code you need to enter. It's super easy. Pandas data frames are really excellent."

By assigning the array of cluster numbers obtained earlier to the cluster_id column of the data frame df, you can attach the cluster numbers as the rightmost column without any loop processing.
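Steps 5 and 6 could be sketched roughly as follows. The post does not say which K-means implementation was used, so this assumes scikit-learn's KMeans; the sample data, n_init, and random_state are also assumptions added for a reproducible illustration. Only the cluster count of 4 and the names df and cluster_id come from the talk.

```python
import pandas as pd
from sklearn.cluster import KMeans

# Hypothetical sample data; the real Excel columns are not shown in the post.
df = pd.DataFrame({
    "visits": [1, 2, 10, 11, 30, 31, 50, 52],
    "spend": [100, 120, 400, 420, 900, 950, 2000, 2100],
})
features = df[["visits", "spend"]].to_numpy()

# Four clusters, as in the talk. n_init and random_state are
# assumptions that make the run repeatable.
model = KMeans(n_clusters=4, n_init=10, random_state=0)

# fit_predict returns one cluster number (0 to 3) per record.
labels = model.fit_predict(features)

# Assigning the whole array at once adds the new column on the right,
# with no explicit loop over rows.
df["cluster_id"] = labels

print(df)
```

The numbering of the clusters is arbitrary, so the same group of records may receive a different number under a different random_state.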
Compared to the initial Excel data, we now have organized data.

Mr. Nakajima: "Let's use this data frame to extract various 'usable' data."

Mr. Nakajima: "It is also possible to display the number of records in each cluster, calculate the average value, and display basic statistics for each cluster. Graphs can also be output."

Unsupervised learning Q&A

At the end, there was time for questions from the audience.

Q: Do horse racing prediction systems perform unsupervised learning by referring to past data?
A: You need to select the data to determine which factors influence winning, but I think it is commonly done.
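Returning to the per-cluster summaries Mr. Nakajima mentioned before the Q&A, here is a minimal sketch using pandas grouping. The data frame, including its cluster_id values, is hypothetical and hard-coded for illustration; it is not the result from the talk.

```python
import pandas as pd

# Hypothetical data frame with cluster numbers already attached.
df = pd.DataFrame({
    "visits": [1, 2, 10, 11, 30, 31, 50, 52],
    "spend": [100, 120, 400, 420, 900, 950, 2000, 2100],
    "cluster_id": [0, 0, 1, 1, 2, 2, 3, 3],
})

# Number of records in each cluster.
counts = df["cluster_id"].value_counts().sort_index()

# Average of every numeric column, per cluster.
means = df.groupby("cluster_id").mean()

# Basic statistics (count, mean, std, min, quartiles, max) for one column.
stats = df.groupby("cluster_id")["spend"].describe()

print(counts)
print(means)
print(stats)
```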