By | 31.10.2023

KNIME is free and open source and works with all your data analysis tools. Create data science solutions with the visual workflow builder and put them into production. KNIME, the Konstanz Information Miner, is a free and open-source data analytics, reporting and integration platform. It integrates various components for machine learning and data mining through its modular data pipelining concept.

We will now look into how to configure these nodes to meet the desired functionality. Please note that we will discuss only those nodes that are relevant to us in the current context of exploring the workflow. The description of the File Reader node tells us that this node reads the adult data set.

The name of the file is adult. The File Reader has two outputs: one goes to the Color Manager node and the other goes to the Statistics node. The Execute menu runs the node. Note that if the node has already been run and is in a green state, this menu is disabled.


Also, note the presence of the Edit Note Description menu option. This allows you to write the description for your node. Now, select the Configure menu option; it shows the screen containing the data from the adult file.

The entire data loading program code is hidden from the user. You can now appreciate the usefulness of such nodes - no coding required. Our next node is the Color Manager.

Color Manager

Select the Color Manager node and go into its configuration by right clicking on it. A color settings dialog will appear. Select the income column from the dropdown list. If the income is less than 50K, the data point is colored green; if it is more, the data point is colored red.
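The color rule described above can be sketched in a few lines of Python. This is only an illustration of the mapping, not how KNIME implements the Color Manager. It assumes the income column holds the labels `<=50K` and `>50K`; the function name and the sample rows are hypothetical.

```python
def income_color(income_label: str) -> str:
    """Map an income class label to a display color."""
    # Assumption: the income column uses the labels "<=50K" and ">50K".
    return "green" if income_label.strip() == "<=50K" else "red"


# Hypothetical sample rows, not taken from the adult data set.
rows = [
    {"age": 39, "income": "<=50K"},
    {"age": 52, "income": ">50K"},
]

# Attach a color to every row, much as a color handler is attached to a table.
colored = [{**row, "color": income_color(row["income"])} for row in rows]
```

Each row keeps its original columns and gains a `color` entry that a plotting step could use later.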

You will see the data point mappings when we look at the scatter plot later in this chapter.

Partitioning

In machine learning, we usually split the entire available data into two parts. The larger part is used for training the model, while the smaller part is used for testing. There are different strategies for partitioning the data.

To define the desired partitioning, right click on the Partitioning node and select the Configure option. While doing the split, the data points are picked up randomly. This reduces the risk that your test data is biased. If you are sure that randomness was already guaranteed during data collection, then you may select linear sampling instead. Once your data is ready for training the model, feed it to the next node, which is the Decision Tree Learner.
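To make the difference between random and linear sampling concrete, here is a minimal Python sketch of a train/test split. It is not KNIME's implementation. The 80% training fraction, the seed, and the function name are assumptions chosen for illustration, and "linear" is modeled here as picking rows at evenly spaced positions across the table.

```python
import random


def partition(rows, train_fraction=0.8, mode="random", seed=0):
    """Split rows into (train, test) lists, preserving the original row order.

    mode="random": training rows are drawn at random.
    mode="linear": training rows are taken at evenly spaced positions,
                   which is only unbiased if the row order is already random.
    """
    n_rows = len(rows)
    n_train = round(n_rows * train_fraction)
    if mode == "random":
        rng = random.Random(seed)  # fixed seed makes the split repeatable
        train_idx = set(rng.sample(range(n_rows), n_train))
    elif mode == "linear":
        if n_train <= 1:
            train_idx = set(range(n_train))
        else:
            step = (n_rows - 1) / (n_train - 1)
            train_idx = {round(i * step) for i in range(n_train)}
    else:
        raise ValueError(f"unknown mode: {mode}")
    train = [row for i, row in enumerate(rows) if i in train_idx]
    test = [row for i, row in enumerate(rows) if i not in train_idx]
    return train, test


train_rows, test_rows = partition(list(range(10)), train_fraction=0.8)
```

With ten rows and a fraction of 0.8, eight rows go to training and two to testing in either mode; only the choice of which rows differs.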

Thus the tree is built based on the income column, which is what we are trying to achieve in this model. We want to separate people with income greater than 50K from those with income less than 50K. After this node runs successfully, your model is ready for testing.

Decision Tree Predictor

The Decision Tree Predictor node applies the tree model to the test data set and appends the model predictions.
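The learner and predictor steps can be illustrated with a deliberately tiny stand-in: a one-split decision tree (a "stump") chosen by Gini impurity. This is a simplified sketch, not the algorithm KNIME's Decision Tree Learner runs. The feature name `hours_per_week`, the sample rows, and all function names are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def learn_stump(rows, feature, target):
    """Learner: find the threshold on one numeric feature with the lowest weighted Gini."""
    values = sorted({row[feature] for row in rows})
    best = None
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [r[target] for r in rows if r[feature] <= threshold]
        right = [r[target] for r in rows if r[feature] > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
        if best is None or score < best[0]:
            majority_left = Counter(left).most_common(1)[0][0]
            majority_right = Counter(right).most_common(1)[0][0]
            best = (score, threshold, majority_left, majority_right)
    if best is None:
        raise ValueError("feature needs at least two distinct values")
    _, threshold, left_label, right_label = best
    return {"feature": feature, "threshold": threshold,
            "left": left_label, "right": right_label}


def predict(model, rows):
    """Predictor: append a 'prediction' column to every row."""
    def label(row):
        side = "left" if row[model["feature"]] <= model["threshold"] else "right"
        return model[side]
    return [{**row, "prediction": label(row)} for row in rows]


# Hypothetical training rows.
training = [
    {"hours_per_week": 20, "income": "<=50K"},
    {"hours_per_week": 30, "income": "<=50K"},
    {"hours_per_week": 35, "income": "<=50K"},
    {"hours_per_week": 50, "income": ">50K"},
    {"hours_per_week": 60, "income": ">50K"},
]
model = learn_stump(training, "hours_per_week", "income")
scored = predict(model, [{"hours_per_week": 25}, {"hours_per_week": 55}])
```

A real decision tree repeats this split search recursively on each side and over all features; the stump only shows the learn-then-predict pattern that the two nodes share.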

The output of the predictor is fed to two different nodes - Scorer and Scatter Plot. Next, we will examine the output of the prediction.

Scorer

This node generates the confusion matrix. To view it, right click on the node.
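For readers unfamiliar with a confusion matrix, the following Python sketch shows how one can be computed from actual and predicted labels. It is an illustration only; the label values and function names are assumptions, and the Scorer node may report additional statistics.

```python
from collections import Counter


def confusion_matrix(actual, predicted):
    """Return a nested dict: matrix[actual_label][predicted_label] -> count."""
    counts = Counter(zip(actual, predicted))
    labels = sorted(set(actual) | set(predicted))
    return {a: {p: counts[(a, p)] for p in labels} for a in labels}


def accuracy(actual, predicted):
    """Fraction of rows where the prediction matches the actual label."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


# Hypothetical labels for four test rows.
actual = ["<=50K", "<=50K", ">50K", ">50K"]
predicted = ["<=50K", ">50K", ">50K", ">50K"]

matrix = confusion_matrix(actual, predicted)
score = accuracy(actual, predicted)
```

The diagonal entries of the matrix count correct predictions, and the off-diagonal entries show which classes the model confuses with each other.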

If you are not satisfied with the results, you may play around with the parameters used in model building; in particular, you may like to revisit and cleanse your data.

Scatter Plot

To see the scatter plot of the data distribution, right click on the Scatter Plot node and select the menu option Interactive View: Scatter Plot.

These are the colors set in our Color Manager node. The distribution is relative to the age as plotted on the x-axis. You may select a different feature for the x-axis by changing the configuration of the node. The configuration dialog is shown here, where we have selected a different column as the feature for the x-axis.

We suggest that you take up the other two nodes, Statistics and Interactive Table, for your self-study. Let us now move on to the most important part of the tutorial: creating your own model. The iris dataset contains three different classes of plants.

We will train our model to classify an unknown plant into one of these three classes. On the next screen, you will be asked for the desired name for the workflow and the destination folder for saving it. Enter this information as desired and click Finish to create a new workspace.

Before you add nodes, you have to download and prepare the iris dataset for our use. The downloaded iris file has no column names, so we will make some changes in it to add them. Open the downloaded file in your favorite text editor and add the header line at the beginning.

Now, you will start adding various nodes. Alternatively, you may use the drag-n-drop feature to add the node into the workspace. After the node is added, you will have to configure it. Right click on the node and select the Configure menu option.
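If you prefer to script the header edit described above instead of using a text editor, a small Python helper could do it. This is a sketch under assumptions: the column names in `HEADER` are illustrative and should be replaced with the header line the tutorial specifies, and the file path in the usage comment is hypothetical.

```python
from pathlib import Path

# Assumption: illustrative column names for the four measurements and the class label.
HEADER = "sepal_length,sepal_width,petal_length,petal_width,class"


def add_header(path, header=HEADER):
    """Prepend a header line to a CSV file unless it is already the first line.

    Returns True if the file was changed, False if the header was already there.
    """
    file = Path(path)
    text = file.read_text()
    if text.splitlines()[:1] == [header]:
        return False
    file.write_text(header + "\n" + text)
    return True


# Usage (hypothetical path):
# add_header("iris_download.csv")
```

The check on the first line makes the helper safe to run twice on the same file.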

You have done this in the earlier lesson. The settings screen looks like the following after the datafile is loaded. To load your dataset, click on the Browse button and select the location of your iris file. The node will load the contents of the file, which are displayed in the lower portion of the configuration box. Once you are satisfied that the datafile is located properly and loaded, click on the OK button to close the configuration dialog.

You will now add some annotation to this node. Right click on the node and select the New Workflow Annotation menu option.

Resize and place the box around the node as desired. Next, make the connection between the two nodes. To do so, click on the output of the File Reader node and keep the mouse button pressed; a rubber band line will appear. Drag it to the input of the Partitioning node and release the mouse button.

A connection is now established between the two nodes. Add the annotation, change the description, and position the node and annotation view as desired.

Adding k-Means Node

Select the k-Means node from the repository and add it to the workspace. If you want to refresh your knowledge of the k-Means algorithm, just look up its description in the description view of the workbench.
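As a quick refresher on what the k-Means algorithm does, here is a minimal pure-Python sketch. It is not the code behind the KNIME node. The iteration cap, the seed, the random initialization strategy, and the sample points are assumptions made for illustration.

```python
import math
import random


def kmeans(points, k, iterations=20, seed=0):
    """Cluster points (tuples of numbers) into k groups; return the k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen points
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for point in points:
            nearest = min(range(k), key=lambda i: math.dist(point, centroids[i]))
            clusters[nearest].append(point)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster else centroids[i]  # keep an empty cluster's centroid in place
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:  # converged: nothing moved
            break
        centroids = new_centroids
    return centroids


# Hypothetical 2-D points forming two well-separated groups.
sample_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers = kmeans(sample_points, k=2)
```

The algorithm alternates between assigning points to the nearest centroid and recomputing the centroids until the centroids stop moving or the iteration cap is reached.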

Open the configuration dialog for the node. It takes two inputs - the prototype model and the data table containing the input data. Just accept the defaults. Now, add some annotation and description to this node. Rearrange your nodes.
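The two inputs mentioned above, a prototype model and a data table, suggest a nearest-prototype assignment. The sketch below shows that idea in Python; it is an assumption about the general technique rather than a description of the node's internals, and the function name and sample values are hypothetical.

```python
import math


def assign_clusters(prototypes, rows):
    """Return, for each row, the index of the nearest prototype (cluster center)."""
    return [
        min(range(len(prototypes)), key=lambda i: math.dist(row, prototypes[i]))
        for row in rows
    ]


# Hypothetical prototypes and new rows.
prototypes = [(0.0, 0.0), (10.0, 10.0)]
new_rows = [(1.0, 1.0), (9.0, 8.0)]
assignments = assign_clusters(prototypes, new_rows)
```

Each entry in `assignments` is the position of the closest prototype, so it can be appended to the data table as a cluster column.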


We need to visualize the output graphically. For this, we will add a scatter plot. We will set different colors and shapes for the three classes in the scatter plot. Thus, we will pass the output of the k-Means node first through the Color Manager node and then through the Shape Manager node.

Add it to the workspace. Leave the configuration at its defaults.

Note that you must open the configuration dialog and hit OK to accept the defaults. Set the description text for the node. Make a connection from the output of k-Means to the input of Color Manager. Leave its configuration at the defaults.

Like the previous one, you must open the configuration dialog and hit OK to set the defaults. Establish the connection from the output of Color Manager to the input of Shape Manager. Set the description for the node. Connect the output of Shape Manager to the input of Scatter Plot.

Leave its configuration at the defaults. Set the description. Finally, add a group annotation to the recently added three nodes (Annotation: Visualization). Reposition the nodes as desired. Your screen should look like the following at this stage. This completes the task of model building. If the workflow does not run successfully, you will need to look up the Console view for the errors, fix them and re-run the workflow.

Now, you are ready to visualize the predicted output of the model. To do so, click on the settings menu at the top right corner of the scatter plot.

This completes our task of model building. KNIME provides several predefined workflows for your learning.


KNIME provides several pre-programmed nodes for reading data in various formats, analyzing data using several ML algorithms, and finally visualizing data in many different ways. Towards the end of the tutorial, you created your own model starting from scratch.