
Two Ways to Extract Strings with Python

Created: 2016-04-11

1. Extracting the text between two markers (multi-line)

Suppose a text file has the following content:

    fdsjhgjhg
    fdshkjhk
    Start
    Good Morning
    Hello World
    End
    dashjkhjk
    dsfjkhk

The goal is to use Python to get the content between "Start" and "End" and write it to a result file.

Solution 1:

    with open("/path/to/input") as infile, open("/path/to/output", "w") as outfile:
        copy = False  # True while we are between the "Start" and "End" lines
        for line in infile:
            if line.strip() == "Start":
                copy = True
            elif line.strip() == "End":
                copy = False
            elif copy:
                outfile.write(line)

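Solution 2:

    import re

    with open("input.txt") as myfile:
        content = myfile.read()

    # re.DOTALL lets "." match newlines; the non-greedy ".*?" stops at the first "End".
    # group(1) holds only the text between the markers.
    # re.search returns None if the markers are missing, so .group(1) would raise.
    text = re.search(r"Start\n(.*?)End", content, re.DOTALL).group(1)

    with open("output.txt", "w") as myfile2:
        myfile2.write(text)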
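The regular-expression approach finds only the first block. If the file may contain several Start/End pairs, something like the following sketch could collect all of them. It assumes each marker occupies a whole line with no surrounding whitespace; the names `BLOCK_RE` and `find_blocks` are hypothetical and not part of the original post.

```python
import re

# Hypothetical helper, not from the original post.
# Assumption: each marker sits alone on its line, with no extra whitespace.
# re.MULTILINE makes ^ and $ match at line boundaries; re.DOTALL lets "." span newlines.
BLOCK_RE = re.compile(r"^Start\n(.*?)^End$", re.DOTALL | re.MULTILINE)


def find_blocks(content):
    """Return the text of every Start/End block in content, markers excluded."""
    return BLOCK_RE.findall(content)


sample = "junk\nStart\nGood Morning\nHello World\nEnd\nmore junk\nStart\nBye\nEnd\n"
print(find_blocks(sample))
```

Anchoring the markers with `^` and `$` keeps lines such as "Startup" or "Ending" from being mistaken for markers.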

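Solution 3:

    import itertools

    with open("input.txt", "r") as f, open("output.txt", "w") as fout:
        while True:
            # Skip lines until "Start"; the marker is the first item left in the iterator.
            it = itertools.dropwhile(lambda line: line.strip() != "Start", f)
            if next(it, None) is None:  # consumes the marker, or detects end of file
                break
            # Write lines until "End" is reached (the marker itself is not written).
            fout.writelines(itertools.takewhile(lambda line: line.strip() != "End", it))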
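The itertools version is compact, so it may help to run it on in-memory data. The sketch below wraps the same dropwhile/takewhile loop in a function and feeds it an `io.StringIO` object instead of a real file. The function name `copy_blocks` and its parameters are hypothetical, chosen only for illustration.

```python
import io
import itertools


def copy_blocks(src, dst, start="Start", end="End"):
    """Copy every run of lines between the start and end markers from src to dst."""
    while True:
        # Discard lines until the start marker; the marker is the next item yielded.
        it = itertools.dropwhile(lambda line: line.strip() != start, src)
        if next(it, None) is None:  # input exhausted, no further block
            break
        # takewhile stops at the end marker and consumes it without yielding it.
        dst.writelines(itertools.takewhile(lambda line: line.strip() != end, it))


src = io.StringIO("a\nStart\nGood Morning\nEnd\nb\nStart\nHello World\nEnd\nc\n")
dst = io.StringIO()
copy_blocks(src, dst)
print(dst.getvalue())
```

Because each pass continues reading from the same underlying iterator, the loop handles any number of blocks and ends once no further "Start" line is found.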

Reference:

http://stackoverflow.com/questions/18865058/extract-values-between-two-strings-in-a-text-file-using-python

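2. Extracting the content between two strings (single line)
Solution (string slicing):

    def getBetween(text, str1, str2):
        """Get the content between str1 and str2 in text ("" if either is missing)."""
        start = text.find(str1)
        if start == -1:
            return ""
        start += len(str1)
        # Look for str2 only after str1, so an earlier occurrence of str2 is ignored.
        end = text.find(str2, start)
        return text[start:end] if end != -1 else ""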
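As an alternative to index arithmetic, the same single-line extraction can be sketched with `str.partition`, which splits a string around the first occurrence of a separator. The name `get_between` and the choice to return `None` when a marker is missing are assumptions made for this example, not part of the referenced project.

```python
def get_between(text, left, right):
    """Return the text between the first `left` and the next `right`, or None."""
    # partition returns (before, separator, after); separator is "" if not found.
    _, found_left, rest = text.partition(left)
    if not found_left:
        return None
    middle, found_right, _ = rest.partition(right)
    return middle if found_right else None


line = "key=StartHello WorldEnd;"
print(get_between(line, "Start", "End"))
```

Returning `None` rather than an empty string lets the caller tell "markers not found" apart from "markers found with nothing between them".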

Reference:

https://github.com/bfishadow/SBB

3. Other implementations

    # Single quotes keep the shell from expanding "$d" and "!" inside the scripts.
    sed -n '/Start/,/End/p' input.txt | grep -Ev '(Start|End)'

    sed -e '1,/Start/d' -e '/End/,$d' input.txt

    awk '/Start/,/End/' input.txt | grep -Ev '(Start|End)'

    awk '/Start/{flag=1;next} /End/{flag=0} flag{ print }' input.txt

    awk '/End/{flag=0} flag; /Start/{flag=1}' input.txt

    perl -lne 'print if((/Start/../End/) && !(/Start/||/End/))' input.txt

Search keywords:
  • awk print line between
References:
  • http://www.unix.com/shell-programming-and-scripting/48676-how-print-only-lines-between-two-strings-using-awk.html
  • https://nixtip.wordpress.com/2010/10/12/print-lines-between-two-patterns-the-awk-way/
  • http://www.shellhacks.com/en/Using-SED-and-AWK-to-Print-Lines-Between-Two-Patterns
  • http://stackoverflow.com/questions/17988756/how-to-select-lines-between-two-marker-patterns-which-may-occur-multiple-times-w

=EOF=


Notice: Unless otherwise noted, articles on CrazyOf.me are original. When reposting, please cite the address of this article as a link. Thanks!
https://crazyof.me/blog/archives/2406.html
