How do I remove half-width space indentation at the start of paragraphs?

Posted by fuyanger on 2017-01-01 in the "语料库标注" (Corpus Annotation) discussion forum

  1. fuyanger

    fuyanger (regular member)

    Could anyone tell me how to turn the first pattern into the second pattern?

    The first pattern:
    Report on the Work of the Government (2007)
    The following is the full text of the Report on the Work of the Government delivered by Premier Wen Jiabao at the Fifth Session of the Tenth National People's Congress on March 5, 2007:
    REPORT ON THE WORK OF THE GOVERNMENT
    Delivered at the Fifth Session of the
    Tenth National People's Congress on March 5, 2007
    Wen Jiabao
    Premier of the State Council
    Fellow Deputies,
    The second pattern:
    REPORT ON THE WORK OF THE GOVERNMENT
    Delivered at the Second Session of the Eleventh National People's Congress on March 5, 2009
    Wen Jiabao
    Premier of the State Council
    Fellow Deputies,
    On behalf of the State Council, I now present to you my report on the work of the government for your deliberation and approval. I also solicit comments and suggestions on the report from the members of the National Committee of the Chinese People's Political Consultative Conference (CPPCC).
    I. Review of the Work in 2008
    The year 2008 was truly eventful. Our country's economic and social development withstood severe challenges and tests that were rarely seen before. Under the leadership of the Communist Party of China (CPC), the people of all our ethnic groups faced difficulties squarely, worked with courage and determination, surmounted all difficulties and obstacles, and made new achievements in reform, opening up and socialist modernization.
     
  2. Please describe your problem more precisely. Your title doesn't match your message body.
     
  3. fuyanger

    fuyanger (regular member)

    My question is how to turn the pattern in the first picture into the pattern in the second picture. :) 7.png 9.png
     
  4. Windows Notepad isn't a good tool for such purposes. Consider Notepad++ or other similar powerful text editors instead, which have built-in support for regular expressions.
    To remove leading spaces, replace all instances of ^\s+ with nothing.
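    For anyone who prefers a script to an editor, the same leading-space removal can be sketched in Python. This is only an illustration: the function name `strip_leading_spaces` is made up for this example, and the pattern uses `[ \t]+` rather than `\s+` because in Python's `re` module `\s` also matches newlines, so `^\s+` in multiline mode would swallow blank lines along with the indentation.

```python
import re

# Matches a run of half-width spaces or tabs at the start of any line.
# re.MULTILINE makes ^ match at the start of every line, not just the text.
LEADING_SPACES = re.compile(r"^[ \t]+", flags=re.MULTILINE)


def strip_leading_spaces(text: str) -> str:
    """Remove leading spaces and tabs from every line, keeping blank lines."""
    return LEADING_SPACES.sub("", text)


if __name__ == "__main__":
    sample = "    Wen Jiabao\n\n  Premier of the State Council\nFellow Deputies,"
    print(strip_leading_spaces(sample))
```

    To process a whole file, read it with an explicit encoding (for example `open(path, encoding="utf-8")`), pass the contents through the function, and write the result to a new file so the original stays untouched.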
    However, it isn't so easy to connect two lines (paragraphs) as you wish. The software doesn't know which lines should be connected. You could do this manually. After all, government reports are not very long.
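    If manual joining becomes too tedious, one rough heuristic is to join a line to the next one whenever it ends in a word that rarely ends a heading or sentence, such as "the" or "of". The sketch below is an assumption-laden illustration, not a reliable solution: the word list `DANGLING_WORDS` and the function names are invented for this example, and the output would still need to be checked by eye.

```python
# Hypothetical list of words that usually signal a line broken mid-phrase.
DANGLING_WORDS = {
    "a", "an", "the", "of", "and", "or", "to",
    "in", "on", "at", "by", "for", "with",
}


def ends_mid_phrase(line: str) -> bool:
    """Return True if the last word of the line is in DANGLING_WORDS."""
    words = line.split()
    return bool(words) and words[-1].lower() in DANGLING_WORDS


def join_broken_lines(text: str) -> str:
    """Merge each line that ends mid-phrase with the line that follows it."""
    merged = []
    for raw in text.splitlines():
        line = raw.strip()
        # Only join when the previous line looks unfinished and this one is non-empty.
        if merged and line and ends_mid_phrase(merged[-1]):
            merged[-1] = merged[-1] + " " + line
        else:
            merged.append(line)
    return "\n".join(merged)


if __name__ == "__main__":
    sample = (
        "Delivered at the Fifth Session of the\n"
        "Tenth National People's Congress on March 5, 2007\n"
        "Wen Jiabao\n"
        "Premier of the State Council"
    )
    print(join_broken_lines(sample))
```

    On the header lines quoted in the first post, this joins the "Delivered at ..." line with the line after it and leaves the name and title lines alone. Lines broken at other points (after a noun, for instance) would not be caught, which is why a manual pass is still needed.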
     
    Last edited: 2017-01-03
  5. fuyanger

    fuyanger (regular member)

    Doing it manually would be time-consuming, since ten years of government reports await me~ Anyway, I will do it! Also, thanks for your advice~