[Share] Basic NLP Tools: TextBlob

Posted on 2024-11-8 10:58

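Introduction to TextBlob

TextBlob is a Python library for processing textual data. It offers a simple API for common NLP tasks such as part-of-speech tagging, noun phrase extraction, sentiment analysis, and spelling correction.

TextBlob in Practice

Install: pip install textblob

Install from a mirror in China: pip install textblob -i https://pypi.tuna.tsinghua.edu.cn/simple

Several features below (tagging, noun phrases, WordNet) rely on NLTK corpora, which can be fetched with: python -m textblob.download_corpora

Reference: https://textblob.readthedocs.io/en/dev/quickstart.html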
from textblob import TextBlob
text = 'I love natural language processing! I am not like fish!'
blob = TextBlob(text)

1. Part-of-speech tagging

blob.tags

Output:
[('I', 'PRP'),
 ('love', 'VBP'),
 ('natural', 'JJ'),
 ('language', 'NN'),
 ('processing', 'NN'),
 ('I', 'PRP'),
 ('am', 'VBP'),
 ('not', 'RB'),
 ('like', 'IN'),
 ('fish', 'NN')]

2. Noun phrase extraction

phrases = blob.noun_phrases
for phrase in phrases:
    print(phrase)

Output:
natural language processing

3. Sentence sentiment scores

for sentence in blob.sentences:
    # polarity ranges from -1.0 (negative) to 1.0 (positive)
    print(str(sentence) + '------>' + str(sentence.sentiment.polarity))

Output:
I love natural language processing!------>0.3125
I am not like fish!------>0.0

4. Tokenization (splitting text into sentences or words)

tokens = blob.words
for w in tokens:
    print(w)

Output:
I
love
natural
language
processing
I
am
not
like
fish
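
Because the word list behaves like an ordinary list of strings, standard Python tools can be applied to it. As a minimal sketch, the tokens printed above are copied into a plain list here (an assumption made so the snippet runs without TextBlob installed) and tallied with collections.Counter:

```python
from collections import Counter

# Tokens as printed above, copied into a plain list so the sketch
# runs without TextBlob installed.
tokens = ["I", "love", "natural", "language", "processing",
          "I", "am", "not", "like", "fish"]

# Lowercase each token first so that "I" and "i" share one count.
counts = Counter(t.lower() for t in tokens)

# Show the three most frequent tokens with their counts.
for word, n in counts.most_common(3):
    print(word, n)
```

With a real blob, blob.word_counts should give a comparable lowercased tally, though the sketch above does not depend on it.
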
sentences = blob.sentences
for s in sentences:
    print(s)

Output:
I love natural language processing!
I am not like fish!

5. Word inflection

tokens = blob.words
for w in tokens:
    # pluralize the word
    print(w.pluralize())
    # singularize the word
    print(w.singularize())

Output:
we
I
love
love
naturals
natural
languages
language
processings
processing
we
I
ams
am
nots
not
likes
like
fish
fish

6. Lemmatization

from textblob import Word
w = Word('went')
print(w.lemmatize('v'))  # 'v' marks the word as a verb
w = Word('octopi')
print(w.lemmatize())     # no argument: the word is treated as a noun

Output:
go
octopus

7. WordNet integration

from textblob.wordnet import VERB
word = Word('octopus')
syn_word = word.synsets
for syn in syn_word:
    print(syn)

Output:
Synset('octopus.n.01')
Synset('octopus.n.02')

Restrict the returned synsets to verbs:

syn_word1 = Word("hack").get_synsets(pos=VERB)
for syn in syn_word1:
    print(syn)

Output:
Synset('chop.v.05')
Synset('hack.v.02')
Synset('hack.v.03')
Synset('hack.v.04')
Synset('hack.v.05')
Synset('hack.v.06')
Synset('hack.v.07')
Synset('hack.v.08')

View the definition of each synset of a word:

Word("beautiful").definitions

Output:
['delighting the senses or exciting intellectual or emotional admiration',
 '(of weather) highly enjoyable']

8. Spelling correction

sen = 'I lvoe naturl language processing!'
sen = TextBlob(sen)
print(sen.correct())

Output:
I love nature language processing!

Note that correct() picks the most likely known word for each token, so 'naturl' becomes 'nature' here rather than 'natural'.

Word.spellcheck() returns a list of (suggestion, confidence) pairs:

w1 = Word('good')
w2 = Word('god')
w3 = Word('gd')
print(w1.spellcheck())
print(w2.spellcheck())
print(w3.spellcheck())

Output:
[('good', 1.0)]
[('god', 1.0)]
[('go', 0.586139896373057), ('god', 0.23510362694300518), ('d', 0.11658031088082901),
 ('g', 0.03626943005181347), ('ed', 0.009067357512953367), ('rd', 0.006476683937823834),
 ('nd', 0.0038860103626943004), ('gr', 0.0025906735751295338), ('sd', 0.0006476683937823834),
 ('md', 0.0006476683937823834), ('id', 0.0006476683937823834), ('gdp', 0.0006476683937823834),
 ('ga', 0.0006476683937823834), ('ad', 0.0006476683937823834)]

9. Parsing

text = TextBlob('I lvoe naturl language processing!')
print(text.parse())

Output:
I/PRP/B-NP/O lvoe/NN/I-NP/O naturl/NN/I-NP/O language/NN/I-NP/O processing/NN/I-NP/O !/./O/O

Each token is rendered as word/POS tag/chunk tag/prepositional-phrase tag.

10. N-grams

text = TextBlob('I lvoe naturl language processing!')
print(text.ngrams(n=2))

Output:
[WordList(['I', 'lvoe']), WordList(['lvoe', 'naturl']), WordList(['naturl', 'language']), WordList(['language', 'processing'])]
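
To make the idea behind n-grams concrete, here is a rough plain-Python sketch of a sliding window over a token list. It is not TextBlob's implementation; the helper name ngrams and the hard-coded token list are assumptions for illustration, and the default of n=3 is chosen only to mirror what TextBlob's own default appears to be.

```python
def ngrams(tokens, n=3):
    """Return every run of n consecutive tokens, each as a list."""
    if n <= 0:
        return []
    # The window starts at each index that still leaves n tokens to take;
    # if the text has fewer than n tokens the range is empty.
    return [tokens[i:i + n] for i in range(len(tokens) - n + 1)]


# Hypothetical token list matching the words of the example sentence
# (punctuation already dropped, as in the output above).
tokens = ["I", "lvoe", "naturl", "language", "processing"]

bigrams = ngrams(tokens, n=2)
for gram in bigrams:
    print(gram)
```

A text with k tokens yields k - n + 1 n-grams, which is why the five-word sentence above produces four bigrams.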

The code is also available on GitHub: https://github.com/yuquanle/StudyForNLP/blob/master/NLPtools/TextBlobDemo.ipynb

Zhihu column: Zhihu user
WeChat official account: StudyForAI (小白人工智能入门学习)

Original article: https://zhuanlan.zhihu.com/p/51865496