dataprocessing
Class DataSource

java.lang.Object
  extended by dataprocessing.DataSource

public class DataSource
extends java.lang.Object

General-purpose data file interface. Loads Examples from a CSV file and handles all data manipulation: shuffling, training/testing splits, and target noise.


Field Summary
 java.lang.String filename
           
 int numInputs
           
 int trainingSize
           
 
Constructor Summary
DataSource(java.lang.String filename)
          Constructor for a DataSource object.
 
Method Summary
 void addTargetNoise(double prob, java.util.ArrayList data)
          Adds class noise to the target of each Example in the supplied ArrayList with probability prob.
 java.util.ArrayList getData()
          Returns an ArrayList of Example objects, containing all data in this source.
 java.util.ArrayList getTestingData()
          Returns an ArrayList of Example objects to use as testing data.
 java.util.ArrayList getTrainingData()
          Returns an ArrayList of Example objects to use as training data.
 int numExamples()
          Returns the total number of Examples available in this DataSource.
 void printData()
          Prints out every Example in this DataSource.
 void shuffle()
          Randomly shuffles this DataSource.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

filename

public java.lang.String filename

numInputs

public int numInputs

trainingSize

public int trainingSize

Constructor Detail

DataSource

public DataSource(java.lang.String filename)
Constructor for a DataSource object. Opens the specified file and reads in all data, which must be in CSV (comma-separated-values) format, with one example per line. The last value on each line is treated as the target variable.

Parameters:
filename - A string specifying a file from which to load some data.
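
The line format described above can be illustrated with a minimal sketch. It is not the DataSource implementation: the class CsvExampleSketch, the stand-in type ParsedExample, and the methods parseLine and parseAll are hypothetical names, and the sketch assumes every value on a line is numeric.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CsvExampleSketch {

    /** Hypothetical stand-in for the Example class, which is not documented on this page. */
    static final class ParsedExample {
        final double[] inputs;
        final double target;

        ParsedExample(double[] inputs, double target) {
            this.inputs = inputs;
            this.target = target;
        }
    }

    /** Parses one CSV line; the last value on the line is treated as the target. */
    static ParsedExample parseLine(String line) {
        String[] fields = line.split(",");
        double[] inputs = new double[fields.length - 1];
        for (int i = 0; i < inputs.length; i++) {
            // Assumption: all values are numeric.
            inputs[i] = Double.parseDouble(fields[i].trim());
        }
        double target = Double.parseDouble(fields[fields.length - 1].trim());
        return new ParsedExample(inputs, target);
    }

    /** Parses every non-blank line into one example, preserving line order. */
    static List<ParsedExample> parseAll(List<String> lines) {
        List<ParsedExample> examples = new ArrayList<>();
        for (String line : lines) {
            if (!line.trim().isEmpty()) {
                examples.add(parseLine(line));
            }
        }
        return examples;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("5.1,3.5,1.4,0", "6.2,2.9,4.3,1");
        for (ParsedExample e : parseAll(lines)) {
            System.out.println(e.inputs.length + " inputs, target " + e.target);
        }
    }
}
```

Under this reading, the number of inputs per example is one less than the number of values on a line, which is presumably what the numInputs field records.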

Method Detail

addTargetNoise

public void addTargetNoise(double prob,
                           java.util.ArrayList data)
Adds class noise to the targets of the supplied ArrayList. Each Example has noise added to its target with probability prob.

Parameters:
prob - The probability of class noise being added to each Example.
data - An ArrayList of Examples to apply the noise to.
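
This page does not say what form the noise takes. The sketch below shows one common interpretation of class noise, assuming binary targets (0 or 1) that are flipped independently with probability prob. The class TargetNoiseSketch and the method flipTargets are hypothetical names, and targets are held in a plain int array rather than in Example objects.

```java
import java.util.Random;

public class TargetNoiseSketch {

    /**
     * Flips each binary target (0 or 1) independently with probability prob.
     * Returns the number of targets that were changed.
     * Assumption: "noise" means a label flip; the real method may differ.
     */
    static int flipTargets(double prob, int[] targets, Random rng) {
        int flipped = 0;
        for (int i = 0; i < targets.length; i++) {
            // nextDouble() is uniform in [0, 1), so this test succeeds with probability prob.
            if (rng.nextDouble() < prob) {
                targets[i] = 1 - targets[i];
                flipped++;
            }
        }
        return flipped;
    }

    public static void main(String[] args) {
        int[] targets = {0, 1, 1, 0, 1, 0, 0, 1};
        int flipped = flipTargets(0.25, targets, new Random(42L));
        System.out.println("Flipped " + flipped + " of " + targets.length + " targets");
    }
}
```

Passing the random generator in as a parameter keeps the sketch reproducible: the same seed gives the same noisy targets.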

getData

public java.util.ArrayList getData()
Returns an ArrayList of Example objects, containing all data in this source.

Returns:
An ArrayList of Examples.

getTestingData

public java.util.ArrayList getTestingData()
Returns an ArrayList of Example objects to use as testing data. If trainingSize is not set, this method returns the last 50% of the data.

Returns:
An ArrayList of Examples.

getTrainingData

public java.util.ArrayList getTrainingData()
Returns an ArrayList of Example objects to use as training data. If trainingSize is not set, this method returns the first 50% of the data.

Returns:
An ArrayList of Examples.
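
The split described by getTrainingData and getTestingData can be sketched as follows. This is an illustration under stated assumptions, not the actual implementation: it treats a non-positive trainingSize as "not set", assumes that when trainingSize is set the first trainingSize examples are training data and the remainder are testing data, and uses integer division for the 50% point. SplitSketch, splitIndex, trainingPart, and testingPart are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SplitSketch {

    /** Index of the first testing example. */
    static int splitIndex(int numExamples, int trainingSize) {
        // Assumption: a non-positive trainingSize means "not set", so split at 50%.
        if (trainingSize <= 0) {
            return numExamples / 2;
        }
        // Never ask for more training examples than exist.
        return Math.min(trainingSize, numExamples);
    }

    /** Copy of the leading portion of the data, used for training. */
    static <T> List<T> trainingPart(List<T> data, int trainingSize) {
        return new ArrayList<>(data.subList(0, splitIndex(data.size(), trainingSize)));
    }

    /** Copy of the trailing portion of the data, used for testing. */
    static <T> List<T> testingPart(List<T> data, int trainingSize) {
        return new ArrayList<>(data.subList(splitIndex(data.size(), trainingSize), data.size()));
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(1, 2, 3, 4, 5, 6);
        System.out.println("train=" + trainingPart(data, 0) + " test=" + testingPart(data, 0));
        System.out.println("train=" + trainingPart(data, 4) + " test=" + testingPart(data, 4));
    }
}
```

Because both halves are derived from the same split index, the training and testing sets never overlap and together cover all of the data.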

numExamples

public int numExamples()
Returns the total number of Examples available in this DataSource.

Returns:
The number of Examples.

printData

public void printData()
Prints out every Example in this DataSource.


shuffle

public void shuffle()
Randomly shuffles this DataSource. After a call to this method, all data in this source will be in random order.

Returns:
Nothing.
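
Since the training and testing sets are taken from the start and end of the data, shuffling first yields a random split. The sketch below shows one way to shuffle a list with the standard java.util.Collections.shuffle method. Whether DataSource uses this method internally is not stated on this page; ShuffleSketch and shuffledCopy are hypothetical names, and the fixed seed is an addition for reproducibility.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class ShuffleSketch {

    /** Returns a shuffled copy of the data; the input list is left unchanged. */
    static <T> List<T> shuffledCopy(List<T> data, long seed) {
        List<T> copy = new ArrayList<>(data);
        // A seeded Random makes the permutation repeatable across runs.
        Collections.shuffle(copy, new Random(seed));
        return copy;
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8);
        List<Integer> shuffled = shuffledCopy(data, 7L);
        System.out.println("original: " + data);
        System.out.println("shuffled: " + shuffled);
    }
}
```

Unlike this sketch, the shuffle method documented above takes no seed and reorders the DataSource in place.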