Easily Convert Excel to a DataSet with Toxy

October 14, 2014 · Category: 开源力量 (Open Source Power)

What's Toxy
Toxy is a .NET data/text extraction framework similar to Apache Tika in Java. It supports a lot of popular formats such as docx, xlsx, xls, pdf, csv, txt, epub, html and so on.

Why Toxy
In the past, we had to use IFilter, a Windows-only technology, to extract text from documents. Toxy is platform independent: it aims to support not only Windows but also Linux (with Mono installed). Toxy is also meant to be easy to use. You don't need to care much about the extension of the file you are extracting, because the framework identifies the file format for you and extracts the data/text into unified structures.

Unified Data Structures
For documents, the data structure is called ToxyDocument.
For spreadsheets, the data structure is called ToxySpreadsheet.
For emails, the data structure is called ToxyEmail.
For business cards, the data structure is called ToxyBusinessCard.
For DOM-based structures, the data structure is called ToxyDom.

Toxy on SNS
QQ Group:297128022

Latest Source Code
Github: https://github.com/tonyqus/toxy
Codeplex: https://git01.codeplex.com/toxy (synced with github periodically)

Author: Tony Qu

Official site: toxy.codeplex.com
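Toxy is the next project I am promoting after NPOI. Its main goal is to solve the problem of document extraction, and the formats it supports include docx, xlsx, xls, csv, vcard and more.
Here is a simple but very useful example: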
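            // Requires: using System.Data; using Toxy;
            ParserContext context = new ParserContext(@"c:\employee.xls");
            // Pick a spreadsheet parser for the file described by the context
            var parser = ParserFactory.CreateSpreadsheet(context);
            var spreadsheet = parser.Parse();
            // Convert the whole workbook to a DataSet
            DataSet ds = spreadsheet.ToDataSet();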
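This is the legendary code that converts an Excel workbook directly into a DataSet. Magic, isn't it?
CreateSpreadsheet supports both xls and xlsx, so you don't need to worry about which of the two you have.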
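Once the DataSet exists, the rest is ordinary ADO.NET. The following is a minimal sketch rather than anything taken from the Toxy documentation. It assumes the Toxy package is referenced and that c:\employee.xls exists. The Dump helper and the DataSetDemo class are hypothetical names introduced only for illustration, and everything done with the result uses standard System.Data members.

```csharp
using System;
using System.Data;
using Toxy;

class DataSetDemo
{
    // Hypothetical helper: prints every table, row and column value in a DataSet.
    static void Dump(DataSet ds)
    {
        foreach (DataTable table in ds.Tables)
        {
            Console.WriteLine("Table: " + table.TableName);
            foreach (DataRow row in table.Rows)
            {
                foreach (DataColumn column in table.Columns)
                {
                    Console.Write(column.ColumnName + "=" + row[column] + "; ");
                }
                Console.WriteLine();
            }
        }
    }

    static void Main()
    {
        // Same calls as in the article; the path is only an example.
        ParserContext context = new ParserContext(@"c:\employee.xls");
        var parser = ParserFactory.CreateSpreadsheet(context);
        DataSet ds = parser.Parse().ToDataSet();
        Dump(ds);
    }
}
```

Because the output is a plain DataSet, it can also be handed to existing data-binding or reporting code without that code knowing Toxy was involved.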
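Besides converting a workbook to a DataSet, Toxy can also convert a single sheet in an Excel file into a DataTable: ToxyTable has a method called ToDataTable, and one ToxySpreadsheet (the equivalent of an Excel workbook) can contain multiple ToxyTable objects.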
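A per-sheet conversion might look like the sketch below. Only ToxyTable, ToxySpreadsheet and ToDataTable are named in this article. The Tables collection and the Name property are assumptions about the Toxy object model and should be checked against the source on GitHub, and the file path is again just an example.

```csharp
using System;
using System.Data;
using Toxy;

class DataTableDemo
{
    static void Main()
    {
        ParserContext context = new ParserContext(@"c:\employee.xlsx");
        ToxySpreadsheet spreadsheet = ParserFactory.CreateSpreadsheet(context).Parse();

        // Assumption: sheets are exposed as spreadsheet.Tables and each has a Name.
        foreach (ToxyTable sheet in spreadsheet.Tables)
        {
            // ToDataTable is the method named in the article.
            DataTable dt = sheet.ToDataTable();
            Console.WriteLine(sheet.Name + ": " + dt.Rows.Count + " rows");
        }
    }
}
```

Converting sheet by sheet is useful when only one sheet is of interest, or when each sheet needs different handling.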
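With Toxy, extracting data from Excel becomes even simpler!
Postscript: I have not yet tried this awesome Toxy component myself; for now I still use NPOI directly.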

Copyright © 极品飞鸽. All rights reserved.
