Counting occurrences in first column of a file
--------------------------------------------------
Hire the world's top talent on demand or become one of them at Toptal: https://topt.al/25cXVn
and get $2,000 discount on your first invoice
--------------------------------------------------
Music by Eric Matyas
https://www.soundimage.org
Track title: Riding Sky Waves v001
--
Chapters
00:00 Counting Occurrences In First Column Of A File
00:20 Accepted Answer Score 13
01:15 Thank you
--
Full question
https://superuser.com/questions/521891/c...
--
Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...
--
Tags
#linux #bash #perl #awk
#avk47
ACCEPTED ANSWER
Score 13
If the input is sorted, you can use uniq -c, which counts runs of adjacent identical lines:
<infile cut -d' ' -f1 | uniq -c
If not, sort it first:
<infile cut -d' ' -f1 | sort -n | uniq -c
Output:
3 1
1 3
2 52
The columns are swapped compared to your requirement; pipe the result through
awk '{ print $2, $1 }' to change that:
1 3
3 1
52 2
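The whole cut/sort/uniq pipeline, including the column swap, can be sketched end to end as follows. The sample data is hypothetical (only its first column matches the counts shown above), and the temporary file stands in for infile:

```shell
# Hypothetical sample data; the second column is made up for illustration.
infile=$(mktemp)
printf '%s\n' '52 a' '1 b' '3 c' '1 d' '52 e' '1 f' > "$infile"

# Take column 1, sort numerically so equal values are adjacent,
# count each run, then swap the columns to "value count".
counts=$(cut -d' ' -f1 < "$infile" | sort -n | uniq -c | awk '{ print $2, $1 }')
printf '%s\n' "$counts"

rm -f "$infile"
```

Because awk splits on whitespace by default, the swap step also drops the leading padding that uniq -c puts in front of each count.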
There's also the awk idiom, which does not require sorted input:
awk '{ h[$1]++ } END { for (k in h) print k, h[k] }' infile
Output:
1 3
52 2
3 1
Because the output comes from a hash, its order is not guaranteed; pipe it to
sort -n if you need it ordered:
awk '{ h[$1]++ } END { for (k in h) print k, h[k] }' infile | sort -n
If you're using GNU awk, you can do the sorting from within awk:
awk '{ h[$1]++ } END { n = asorti(h, d, "@ind_num_asc"); for (i = 1; i <= n; i++) print d[i], h[d[i]] }' infile
In the last two cases the output is:
1 3
3 1
52 2
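As a rough sketch, the portable awk-plus-sort variant can be wrapped in a small shell function. The name count_first_column and the sample input are assumptions made for illustration, not part of the original answer:

```shell
# count_first_column is a hypothetical helper name.
count_first_column() {
    # h[$1]++ tallies each distinct first field; END prints "value count".
    # sort -n restores numeric order, since awk's for-in order is unspecified.
    awk '{ h[$1]++ } END { for (k in h) print k, h[k] }' "$@" | sort -n
}

# Hypothetical unsorted input on standard input; a file name argument works too.
tally=$(printf '%s\n' '52 a' '1 b' '3 c' '1 d' '52 e' '1 f' | count_first_column)
printf '%s\n' "$tally"
```

This version needs no pre-sorting of the input and uses only features found in any POSIX awk, so it should behave the same without GNU awk.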