Post Process of Data after Import with Data Connector

Data is protected after using data connector to import data, so user can’t remove or modify data after import. Worksheet After Import Script provides an option to post processing of imported data. The script will be executed before the imported data is locked, so script can do some data cleanup and then data is locked. The worksheet script is embedded in worksheet so user can save worksheet/book as template for future use.

In this blog, I am going to show several examples.  Please first download the zip file with the data files used in the blog.

Example 1:  Simply delete last two rows of data after import

Start a new worksheet. Drag  and drop ex1.dat into the  worksheet. The data connector icon is added at the upper left corner of the worksheet and the imported data has eight rows. Try to highlight the last 2 rows and delete them. It isn’t allowed since the data is protected.

Let’s start a new worksheet. Choose Worksheet -> Worksheet Script… menu. You can see it opens the Script tab of the Worksheet Properties dialog. Put the following script in the Script box. Check After Import box so the script will get executed after data import. Click OK.

Choose File: Save Template as… menu. Save the template with name: Remove_2_rows. Choose File: Recent Books: Remove_2_rows menu to load it.

Drag and drop ex1.dat into this workbook. There are only 6 rows of data. The last two rows are removed.

You can share this template with others.

Example 2: Combine duplicated data rows after import

Create a new worksheet, drag and drop ex2.csv into worksheet. The Time column has duplicates. Again because the data is protected, we can’t combine duplicated rows by averaged values in the worksheet.

Again we can use worksheet script to do so. Create a new workbook. Select Format: Worksheet… menu . Go to Script tab. Check After Import box. Add the following script this time.

X-function wdeldup is used in this case. It takes Col(1) as reference column and combines duplicated rows with averaged value.

Click OK and now drag ex2.csv into the worksheet. The first column now only contains unique time values. Values in column B, C and D are averaged. Data is still protected. Any time the data is modified and you click the connector lock to import again, the script will be run again to remove duplicate rows by averaging.

Example 3: Keep the top 10 records after import

In this example we will show a more complicated example using Python code. Create a new workbook, drag and drop ex3.dat into the worksheet. The data shows the protein consumption from different foods in some countries. We will run a piece of Python code to only keep the records of top 10 countries that has most protein dependence from Red Meat during import.

Create a new workbook. right click in gray area and choose Add Text… context menu. Enter Python Code shown as below.

Right click on the text object and select Properties…. In the dialog, On Frame tab, Choose Box as Frame. Go to Programming tab, change the Name to btCode. Enter the following Python code:

The code will pass the data in worksheet to a dataframe object in Python. Use nlargest method to update the data frame with 10 measurements of largest ‘Read Meat’ column values. Then it passes the data back to Origin.

Go to menu Worksheet -> Worksheet Script... Enter the script run -pyb btCode.  This script is just to run the Python code saved in the text object with name btCode.

Drag and drop into the worksheet.  now ten rows with largest ‘red meat’ values are imported into the worksheet.

Leave a Reply

Your email address will not be published. Required fields are marked *