How should figures, tables, and formulas in a corpus be handled?

greatlion

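Junior member
I have recently been collecting corpus material, mainly engineering papers, but they contain a lot of formulas, figures, and tables. What is the best way to handle these? I would appreciate any expert advice. Thanks!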
 
Re: How should figures, tables, and formulas in a corpus be handled?

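Tables can be handled, since what sits inside the table frame is still text, but figures are much harder to deal with, unless your retrieval software also has OCR (optical character recognition) capability.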
 
Re: How should figures, tables, and formulas in a corpus be handled?

I don't think there is a solution for the corpus you have in mind, since a well-accepted corpus is a collection of ... TEXTS, rather than TABLES and FIGURES.
 
Re: How should figures, tables, and formulas in a corpus be handled?

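Does that mean engineering papers are not suitable as corpus material? The vast majority of engineering papers consist largely of figures and tables.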
 
Re: How should figures, tables, and formulas in a corpus be handled?

The usual practice in corpus creation is to retain the textual data while omitting graphics and tables, replacing each with a "placeholder" (e.g. an XML element indicating what has been omitted).
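
The placeholder approach described above could be sketched roughly as follows. This is only a minimal illustration, not an established tool: it assumes the figures, tables, and formulas were already tagged during conversion with bracketed markers such as [FIGURE: ...], and both that marker format and the `omitted` element with its `type`, `n`, and `desc` attributes are hypothetical names chosen for this example.

```python
import re
from xml.sax.saxutils import quoteattr

# Hypothetical marker format: assumes non-text items were tagged during
# conversion as [FIGURE: ...], [TABLE: ...] or [FORMULA: ...].
MARKER = re.compile(r"\[(FIGURE|TABLE|FORMULA):\s*([^\]]*)\]")


def insert_placeholders(text):
    """Replace each tagged non-text item with an empty XML placeholder."""
    counts = {}  # running number per item type, reset for every document

    def to_placeholder(match):
        kind = match.group(1).lower()
        counts[kind] = counts.get(kind, 0) + 1
        desc = match.group(2).strip()
        # quoteattr escapes the value and wraps it in quotes.
        # <omitted> is an illustrative element name, not a standard one.
        return "<omitted type=%s n=\"%d\" desc=%s/>" % (
            quoteattr(kind), counts[kind], quoteattr(desc))

    return MARKER.sub(to_placeholder, text)


sample = ("The stress rises with load [FIGURE: Stress vs. load]. "
          "See [TABLE: Material constants].")
result = insert_placeholders(sample)
print(result)
```

The running text stays searchable, while each placeholder records what was left out and where, so the original item can still be looked up in the source paper if needed.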
 
Re: How should figures, tables, and formulas in a corpus be handled?

A good solution to F1's issue.
 
Re: How should figures, tables, and formulas in a corpus be handled?

Thanks, Dr. Xiao!
 
Re: How should figures, tables, and formulas in a corpus be handled?

Yes, that matches the advice my supervisor gave me. I'll think it over a bit more and then start on the compilation.
 