To work with data in an ML project, the data first needs to be loaded. Data can be loaded from a CSV file, an XLS file, a database, a data set, or a temporary table. There are two ways to read data in BellaDati ML Studio: row by row or as a stream.

Reading Row by Row

 

Reading CSV File

Function readCSVFile() is used for loading data from a CSV file. The function is defined like this:

readCSVFile(String file, String separator, String escape, int limit, Closure<Object> closure)

Parameters

Parameters file and separator are mandatory; parameters escape, limit and closure are optional.

  • file - defines the name of the file to be read. This file needs to be uploaded to the project.
  • separator - defines the separator between values, for example a comma or a semicolon.
  • escape - defines the character used for escaping text.
  • limit - defines the maximum number of rows to load.
  • closure - the code block executed for each loaded row.

Sample usage

def rows = 0
readCSVFile('file.csv', ',', '', 10) {
  rows++
  println index
  println values[1]
}
println rows

This code prints the row index and the value of the second column for the first 10 rows of the file to the console. After the loop finishes, it prints the total number of iterations, in this case 10.

Reading XLS File

 

Reading From SQL Database

Function readSQL() is used for loading data from an SQL database. This function uses SQL connections that were previously defined in BellaDati. See Data Sources for more information.

The function is defined like this:

readSQL(Long id, String sql, int limit, Closure<Object> closure)

Parameters

Parameters id and sql are mandatory; parameters limit and closure are optional.

  • id - defines the ID of the data source. It can be set by the Code builder.
  • sql - defines the SQL query.
  • limit - defines the maximum number of rows to load.
  • closure - the code block executed for each loaded row.

Sample usage

// the counter must be defined before the closure increments it
def rows = 0
readSQL(1, 'select * from customers', 10) {
  println values[0]
  rows++
  println columns[0]
}
println rows

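This code uses the database connection with ID 1 and loads all columns for 10 rows from the table customers. For each row, it prints the first value and the first entry of columns, and after the loop it prints the number of rows read.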

Reading from Data Set

Function readDataset() is used for loading data from a data set. The function is defined like this:

readDataset(Integer id, int limit, Closure<Object> closure)

Parameters

Parameters id and closure are mandatory; parameter limit is optional.

  • id - defines the ID of the data set. It can be set by the Code builder or found in the URL of the data set.
  • limit - defines the maximum number of rows to load.
  • closure - the code block executed for each loaded row.

Sample usage
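
A minimal sketch of how the call might look, modeled on the other samples on this page. The data set ID 1 and the limit of 10 are placeholder values, and it is an assumption that the closure exposes values in the same way as in readCSVFile() and readSQL().

```groovy
// Assumption: a data set with ID 1 exists; take the real ID from the
// Code builder or from the URL of the data set.
def rows = 0
readDataset(1, 10) {
  // Assumption: values holds the current row's values, as in the other samples
  println values[0]
  rows++
}
println rows
```

If the assumptions hold, this prints the value of the first column for up to 10 rows of the data set and then the number of rows processed.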

Reading Table

Function readTable() can be used for loading data from a temporary table which was previously stored in the project. The table is available for the current session only. The function is defined like this:

readTable(String id, Closure<Object> closure)

Parameters

Parameter id is mandatory; parameter closure is optional.

  • id - defines the ID (name) of the table. It is set when creating the table.
  • closure - the code block executed for each row of the table.

Sample usage

readTable('table') {
  println values[0]
}

This code prints the value of the first column for each row of the table to the console.

Reading as a Stream

readTable - loads a temporary table, which is valid within the current session

row by row

  • loop - the body of the loop is executed for each row
  • in each iteration, the row, values, column names and indexes are set
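
The row-by-row behavior described above can be illustrated with a small stand-in written in plain Groovy. This is only a sketch of the general technique (calling a closure once per row with a per-row delegate), not the BellaDati implementation. The names readRows and RowContext are hypothetical, and the 0-based index is an assumption.

```groovy
// Hypothetical per-row context; the field names mirror the variables used in the samples.
class RowContext {
  int index
  List values
  List<String> columns
}

// Hypothetical stand-in for a row-by-row reader: calls the closure once per row,
// up to the optional limit, and returns the number of rows processed.
def readRows(List<String> columns, List<List> data, int limit = -1, Closure closure) {
  def count = 0
  for (row in data) {
    if (limit >= 0 && count >= limit) break
    // Set the row index, values and column names for this iteration
    closure.delegate = new RowContext(index: count, values: row, columns: columns)
    closure.resolveStrategy = Closure.DELEGATE_FIRST
    closure.call()
    count++
  }
  return count
}

def seen = []
def total = readRows(['id', 'name'], [[1, 'Ann'], [2, 'Bob'], [3, 'Eve']], 2) {
  // index and values are resolved against the per-row delegate
  seen << "${index}:${values[1]}".toString()
}
println seen
println total
```

With a limit of 2, the closure runs for the first two rows only, which matches how the limit parameter is described for the read functions on this page.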

 

stream

  • no iteration takes place
  • used, for example, with Python scripts

 

 

 

Sample usage
