Databases and Datacubes
- Where do you get the data from?
- You can just have a set of text files; lots of text-mining
applications work directly from these.
- You can also just have a simple dataset in a flat file or in
an Excel spreadsheet. I do a lot of my analysis in Excel.
- You can also take advantage of database technology to deal
with really complex datasets.
- You can even go further and convert operational databases
into specialised databases for data mining.
- This can involve alignment to make mining more efficient.
- It can also involve data cleansing; this includes fixing bad
fields and combining multiple entries that refer to the same thing.
- This is a topic for a whole lecture in itself.
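
As a rough illustration of the data cleansing step mentioned above, here is a minimal Python sketch. Everything in it is an assumption made for the example: the records, the field names (name, city, age), and the rule that two entries belong together when their cleaned names match. Real cleansing rules depend heavily on the data.

```python
# Hypothetical raw records; the names and fields are invented for illustration.
raw_records = [
    {"name": " Alice Smith ", "city": "leeds", "age": "34"},
    {"name": "alice smith", "city": "Leeds", "age": ""},
    {"name": "Bob Jones", "city": "York", "age": "n/a"},
]


def clean_record(record):
    """Fix bad fields: trim whitespace, normalise case, parse age or use None."""
    age_text = record["age"].strip()
    return {
        "name": " ".join(record["name"].split()).title(),
        "city": record["city"].strip().title(),
        "age": int(age_text) if age_text.isdecimal() else None,
    }


def combine_entries(records):
    """Merge records that share a cleaned name (an assumed matching rule).

    For each field, the first non-missing value seen is kept.
    """
    merged = {}
    for record in map(clean_record, records):
        existing = merged.setdefault(record["name"], dict(record))
        for field, value in record.items():
            if existing[field] is None:
                existing[field] = value
    return list(merged.values())


cleaned = combine_entries(raw_records)
for row in cleaned:
    print(row)
```

Here the two Alice Smith entries collapse into one record, and the unparseable age for Bob Jones becomes a missing value rather than a bad field. Matching on name alone is a deliberately crude choice; it would wrongly merge two different people who share a name.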