Is there a way to quickly identify files with Windows or Unix line termination?
--
Full question
https://superuser.com/questions/460169/i...
--
Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...
--
Tags
#linux #windows #commandline #newlines #crlf
ACCEPTED ANSWER
Score 13
$ file f1 f2 f3
f1: ASCII text, with CRLF, LF line terminators
f2: ASCII text, with CRLF line terminators
f3: ASCII text
If you feel it necessary to check every line in the file, you can do this:
$ grep -c "^M" f1 f2
f1:0
f2:3
$ wc -l f1 f2
3 f1
3 f2
6 total
The "^M" was entered using Ctrl+V Ctrl+M and is the ASCII carriage-return (CR) character.
Here we see that file f1 has three lines but no CRs, so all of its line endings must be Unix-style lone LFs.
File f2 has equal numbers of lines and CRs, so it is reasonable to guess that it uses the CR+LF line endings used by MS-DOS and Windows.
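The grep and wc counts above only suggest the line-ending style. As a rough cross-check, here is a minimal Python sketch that counts each terminator kind directly from the raw bytes. The name count_terminators is a hypothetical helper, not part of any tool mentioned in this answer.

```python
def count_terminators(data: bytes) -> dict:
    """Count CRLF pairs, lone LFs and lone CRs in raw file bytes."""
    crlf = data.count(b"\r\n")       # Windows-style CR+LF pairs
    lf = data.count(b"\n") - crlf    # LFs that are not part of a pair
    cr = data.count(b"\r") - crlf    # CRs that are not part of a pair
    return {"CRLF": crlf, "LF": lf, "CR": cr}


# Small in-memory demo: one CRLF, one lone LF, one lone CR.
sample = b"one\r\ntwo\nthree\rfour"
print(count_terminators(sample))
```

To run it on a real file, read the file in binary mode, for example open("f1", "rb").read(), so that Python does not translate the line endings before they are counted.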
ANSWER 2
Score 2
On Windows, a quick way to tell is to open your file in Notepad. Older versions of Notepad (before Windows 10 version 1809) show line breaks only at Windows-style terminators (CR+LF), and not at Unix terminators (LF). So in those versions your Unix text will look like this:
Line1Line2Line3Line4
whereas Windows text will look like this:
line1
line2
line3
line4
I'm not very familiar with Unix/Linux platforms, but I'm sure you can use similar tricks with programs like gedit or emacs.
ANSWER 3
Score 0
c=($(perl -0777ne 'print $_ =~ tr/\n//; print " ";
print $_ =~ tr/\r//;'))
if ((!(c[0] + c[1]))) ;then echo no line endings
elif (( c[0] && !c[1] )) ;then echo LF
elif (( !c[0] && c[1] )) ;then echo CR
elif (( c[0] == c[1] )) ;then echo CRLF
else echo "ambiguous LF ${c[0]} CR ${c[1]}"
fi
Note that, for speed's sake, only individual \r and \n characters are counted, but it would be a pretty wacky file that had an equal number of both and yet was not a Windows CRLF file.
Also note that the *nix tool file does not do a complete scan of the file, whereas this perl script does. You haven't mentioned which platform you want it to run on; I have used a bash script to test perl's output, but that can be changed to a Windows cmd script.
You can just pipe your file to it.
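For readers without bash, the same decision table can be sketched in Python. This is an illustrative port, not the author's code: it keeps the answer's shortcut of comparing raw LF and CR totals, so it shares the same blind spot for unusual mixed files. The name classify_counts is hypothetical.

```python
def classify_counts(lf: int, cr: int) -> str:
    """Mirror the shell branches using raw LF and CR totals."""
    if lf == 0 and cr == 0:
        return "no line endings"
    if cr == 0:
        return "LF"
    if lf == 0:
        return "CR"
    if lf == cr:
        return "CRLF"   # equal totals: assume every CR is paired with an LF
    return f"ambiguous LF {lf} CR {cr}"


# Demo on in-memory bytes; data.count() scans the whole buffer.
data = b"alpha\r\nbeta\r\n"
print(classify_counts(data.count(b"\n"), data.count(b"\r")))
```

To use it on piped input, the bytes could be read with sys.stdin.buffer.read() and counted the same way.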
ANSWER 4
Score 0
PowerShell is built into Windows and is available for all other major platforms, so you can use it to detect the format like this:
('LF', 'CRLF')[([regex]::Matches($(gc -Ra path\to\file.txt), "\r?\n") | group -P Length).Group[0].Value.Length - 1]
If you want it to work for files that mix CRLF and LF, you need the more complete solution below:
$content = Get-Content -Raw path\to\file.txt
[regex]::Matches($content, "\r?\n") | Group-Object -Property Length `
    | Tee-Object -Variable newlines
if ($newlines.Length -eq 2) {
    echo "Mixed CRLF"
} else {
    if ($newlines[0].Group[0].Value.Length -eq 2) {
        echo "CRLF"
    } else {
        echo "LF"
    }
}
Also note that I'm assuming there are only CRLF and LF endings, as git does. To make it work for CR-only files you'll need some small changes.
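One possible shape for those small changes, sketched in Python rather than PowerShell: match CRLF, lone CR, or lone LF, tally the kinds, and report "Mixed" when more than one kind appears. The function name detect_format and the labels are illustrative assumptions, not part of the answer.

```python
import re
from collections import Counter

# CRLF is tried first so a pair is not split into a lone CR plus a lone LF.
NEWLINE = re.compile(rb"\r\n|\r|\n")
LABELS = {b"\r\n": "CRLF", b"\r": "CR", b"\n": "LF"}


def detect_format(data: bytes) -> str:
    """Return 'CRLF', 'LF', 'CR', 'Mixed', or 'none' for raw file bytes."""
    kinds = Counter(LABELS[m.group()] for m in NEWLINE.finditer(data))
    if not kinds:
        return "none"
    if len(kinds) > 1:
        return "Mixed"
    return next(iter(kinds))


print(detect_format(b"old mac\rstyle\r"))
```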
Another solution:
# Windows PowerShell 5.1 syntax; on PowerShell 6+ replace -Encoding Byte with -AsByteStream
$content = Get-Content -Raw -Encoding Byte .\path\to\file.txt
$cr = 0; $lf = 0
foreach ($c in $content) { if ($c -eq 10) { $lf++ } elseif ($c -eq 13) { $cr++ } }
echo "CR = $cr, LF = $lf"