Word Frequency Counting in Java

Date: 2020-10-15

1. Pick any English article and save it in a txt file. The text used here is:

There are moments in life when you miss someone so much that you just want to pick them from your dreams and hug them for real! Dream what you want to dream;go where you want to go;be what you want to be,because you have only one life and one chance to do all the things you want to do.

  May you have enough happiness to make you sweet,enough trials to make you strong,enough sorrow to keep you human,enough hope to make you happy? Always put yourself in others’shoes.If you feel that it hurts you,it probably hurts the other person, too.

  The happiest of people don’t necessarily have the best of everything;they just make the most of everything that comes along their way.Happiness lies for those who cry,those who hurt, those who have searched,and those who have tried,for only they can appreciate the importance of people

  who have touched their lives.Love begins with a smile,grows with a kiss and ends with a tear.The brightest future will always be based on a forgotten past, you can’t go on well in lifeuntil you let go of your past failures and heartaches.

  When you were born,you were crying and everyone around you was smiling.Live your life so that when you die,you're the one who is smiling and everyone around you is crying.

  Please send this message to those people who mean something to you,to those who have touched your life in one way or another,to those who make you smile when you really need it,to those that make you see the brighter side of things when you are really down,to those who you want to let them know that you appreciate their friendship.And if you don’t, don’t worry,nothing bad will happen to you,you will just miss out on the opportunity to brighten someone’s day with this message.

2. Write a WordCount class that performs the word frequency count

package hdfs;

import java.io.BufferedReader;
import java.io.FileReader;
import java.util.Map;
import java.util.TreeMap;
import java.util.Set;
import java.util.TreeSet;
import java.util.Arrays;


public class WordCount { 
    
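    // Reads the whole file at the given path and returns its content as one string.
    // FileReader decodes the bytes with the JVM's default charset.
    // If reading fails, the stack trace is printed and whatever was read so far is returned.
    public static String getTxtString(String path) {
        StringBuilder sBuilder = new StringBuilder();
        // try-with-resources closes the reader automatically, even if an exception is thrown
        try (BufferedReader br = new BufferedReader(new FileReader(path))) {
            char[] buffer = new char[512];
            int len;
            // read() returns the number of chars placed in the buffer, or -1 at end of file
            while ((len = br.read(buffer)) != -1) {
                sBuilder.append(buffer, 0, len);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
        return sBuilder.toString();
    }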

    
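    // Splits the text into words and returns a map from each word to its number of occurrences.
    // TreeMap keeps the keys in natural String order (uppercase letters sort before lowercase).
    public static Map<String, Integer> getCounter(String txtString) {
        Map<String, Integer> treeMap = new TreeMap<>();
        // Delimiters: whitespace, ? ' , ! ; . and the ideographic (full-width) space U+3000
        String[] strArr = txtString.split("[\\s?',!;.\u3000]+");
        // Collect the distinct words; the TreeSet also sorts them
        Set<String> treeSet = new TreeSet<>(Arrays.asList(strArr));
        // split() yields a leading empty string when the text starts with a delimiter
        treeSet.remove("");
        for (String s : treeSet) {
            int counter = 0;
            // Count how often this word occurs in the full token array
            for (String s1 : strArr) {
                if (s.equals(s1)) counter++;
            }
            treeMap.put(s, counter);
        }
        return treeMap;
    }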
}
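
getCounter rescans the whole token array once for every distinct word, so its cost is roughly (number of tokens) × (number of distinct words) comparisons. That is fine for a short article. For larger texts, one possible alternative is to collect the counts in a single pass with Map.merge. The following is only a minimal sketch, not the original author's code: the class name SinglePassWordCount and the method name count are made up for illustration, and the delimiter pattern is copied from getCounter.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical class for illustration; not part of the original article.
class SinglePassWordCount {

    // Same delimiter set as WordCount.getCounter.
    private static final String DELIMITERS = "[\\s?',!;.\u3000]+";

    // Counts every token in one pass over the split result.
    static Map<String, Integer> count(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String token : text.split(DELIMITERS)) {
            if (token.isEmpty()) {
                continue; // split() yields a leading "" when the text starts with a delimiter
            }
            // merge() stores 1 for a new word, otherwise adds 1 to the existing count
            counts.merge(token, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // prints {be=2, go=2, to=2, want=2, what=1, where=1, you=2}
        System.out.println(count("go where you want to go;be what you want to be"));
    }
}
```

Because the result is still a TreeMap, the iteration order is the same alphabetical order that getCounter produces.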

3. Create a TestWordCount class to test it

package hdfs;

import java.util.Map;

public class TestWordCount { 
    public static void main(String[] args) { 
        String txtString = WordCount.getTxtString("src/main/java/hdfs/article.txt");
        Map<String, Integer> treeMap = WordCount.getCounter(txtString);
        treeMap.forEach((key, value) -> System.out.println(key + ": " + value));
    }
}
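
The test prints the words alphabetically because the map is a TreeMap. If the most frequent words are of more interest, one possible approach is to copy the entries into a list and sort that list by count. The sketch below is not part of the original test class: FrequencyOrder and byCountDescending are hypothetical names, and the three sample counts are taken from the program output listed in this article.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration; not part of the original article.
class FrequencyOrder {

    // Returns the entries of the given map ordered from highest to lowest count.
    static List<Map.Entry<String, Integer>> byCountDescending(Map<String, Integer> counts) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        // List.sort is stable, so words with equal counts keep the map's iteration order
        entries.sort(Map.Entry.<String, Integer>comparingByValue().reversed());
        return entries;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new TreeMap<>();
        counts.put("to", 19);
        counts.put("you", 32);
        counts.put("who", 10);
        // prints "you: 32", "to: 19", "who: 10" on separate lines
        byCountDescending(counts)
                .forEach(e -> System.out.println(e.getKey() + ": " + e.getValue()));
    }
}
```

In TestWordCount, the map returned by WordCount.getCounter could be passed to such a helper in place of the hand-built sample map.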

Output:

Always: 1
And: 1
Dream: 1
Happiness: 1
If: 1
Live: 1
Love: 1
May: 1
Please: 1
The: 2
There: 1
When: 1
a: 4
all: 1
along: 1
always: 1
and: 7
another: 1
appreciate: 2
are: 2
around: 2
bad: 1
based: 1
be: 3
because: 1
begins: 1
best: 1
born: 1
brighten: 1
brighter: 1
brightest: 1
can: 1
can’t: 1
chance: 1
comes: 1
cry: 1
crying: 2
day: 1
die: 1
do: 2
don’t: 3
down: 1
dream: 1
dreams: 1
ends: 1
enough: 4
everyone: 2
everything: 2
failures: 1
feel: 1
for: 3
forgotten: 1
friendship: 1
from: 1
future: 1
go: 4
grows: 1
happen: 1
happiest: 1
happiness: 1
happy: 1
have: 7
heartaches: 1
hope: 1
hug: 1
human: 1
hurt: 1
hurts: 2
if: 1
importance: 1
in: 4
is: 2
it: 3
just: 3
keep: 1
kiss: 1
know: 1
let: 2
lies: 1
life: 4
lifeuntil: 1
lives: 1
make: 6
mean: 1
message: 2
miss: 2
moments: 1
most: 1
much: 1
necessarily: 1
need: 1
nothing: 1
of: 6
on: 3
one: 4
only: 2
opportunity: 1
or: 1
other: 1
others’shoes: 1
out: 1
past: 2
people: 3
person: 1
pick: 1
probably: 1
put: 1
re: 1
real: 1
really: 2
searched: 1
see: 1
send: 1
side: 1
smile: 2
smiling: 2
so: 2
someone: 1
someone’s: 1
something: 1
sorrow: 1
strong: 1
sweet: 1
tear: 1
that: 6
the: 8
their: 3
them: 3
they: 2
things: 2
this: 2
those: 9
to: 19
too: 1
touched: 2
trials: 1
tried: 1
want: 6
was: 1
way: 2
well: 1
were: 2
what: 2
when: 4
where: 1
who: 10
will: 3
with: 4
worry: 1
you: 32
your: 4
yourself: 1
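
A few entries in this output follow directly from how the text is split. Matching is case-sensitive, so "The" and "the" (and "Dream" and "dream") are counted separately. The straight apostrophe is a delimiter, so "you're" contributes a stray "re", while the typographic apostrophe is not, so "don’t" stays whole. Entries such as "lifeuntil" and "others’shoes" come from missing spaces in the article itself. If merged case and intact contractions are preferred, a sketch along the following lines might do; it is an assumption about the desired behavior, not the original method, and the name NormalizedWordCount is hypothetical.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical class for illustration; not part of the original article.
class NormalizedWordCount {

    // Split on every run of characters that is neither a Unicode letter nor an
    // apostrophe (straight ' or typographic U+2019), so contractions stay one token.
    private static final String NON_WORD = "[^\\p{L}'\u2019]+";

    // Lower-cases the text first, then counts each token in a single pass.
    static Map<String, Integer> count(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String token : text.toLowerCase(Locale.ROOT).split(NON_WORD)) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = count("The happiest... the one; you're the one, don\u2019t worry!");
        counts.forEach((word, n) -> System.out.println(word + ": " + n));
    }
}
```

This does not repair typos in the input: a token like "lifeuntil" would still be counted as one word.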
 