How to extract a complete list of extension types within a directory?
Full question
https://superuser.com/questions/397943/h...
--
Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...
--
Tags
#windowsxp #script #shellscript #batchfile #fileextension
ACCEPTED ANSWER
Score 40
This batch script will do it.
@echo off
set target=%~1
if "%target%"=="" set target=%cd%

setlocal EnableDelayedExpansion

set LF=^


rem Previous two lines deliberately left blank for LF to work.

for /f "tokens=*" %%i in ('dir /b /s /a:-d "%target%"') do (
    set ext=%%~xi
    if "!ext!"=="" set ext=FileWithNoExtension
    echo !extlist! | find "!ext!:" > nul
    if not !ERRORLEVEL! == 0 set extlist=!extlist!!ext!:
)

echo %extlist::=!LF!%

endlocal
Save it as any .bat file, and run it with the command batchfile (substitute whatever you named it) to list the current directory, or specify a path with batchfile "path". It will search all subdirectories.
If you want to export to a file, use batchfile >filename.txt (or batchfile "path" >filename.txt).
Explanation
Everything before the for /f... line just sets things up: it gets the target directory to search, enables delayed expansion, which lets me update variables in the loop, and defines a newline (LF) that I can use for neater output. Oh, and the %~1 means "get the first argument, removing quotes", which prevents doubled-up quotes - see for /?.
The loop uses that dir /b /s /a:-d "%target%" command, grabbing a list of all files in all subdirectories under the target.
%%~xi extracts the extension out of the full paths the dir command returns.
An empty extension is replaced with "FileWithNoExtension", so you know there is such a file - if I added an empty line instead, it would not be quite as obvious.
The whole current list is sent through a find command, to ensure uniqueness. The text output of the find command is sent to nul, essentially a black hole - we don't want it. Since we always append a : at the end of the list, we should also make sure the search query ends with a : so it doesn't match partial results - see comments.
%ERRORLEVEL% is set by the find command; a value of 0 indicates there was a match. So if it's not 0, the current extension is not on the list so far and should be added.
The echo line prints the list, and I also replace my placeholders (:) with newlines to make it look nice.
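For readers who find the batch syntax hard to follow, here is a minimal Python sketch of the same loop. It is not part of the original answer, and the function name unique_extensions is made up for illustration. It deliberately keeps the script's colon-terminated list to show why the search string ends with a colon; in Python a set would be the more usual choice. One difference to be aware of: os.path.splitext treats a name like .gitignore as having no extension, which may not match what %%~xi reports.

```python
import os

NO_EXT = "FileWithNoExtension"  # same placeholder the batch script uses


def unique_extensions(root="."):
    """Return the extensions found under root, in first-seen order."""
    extlist = ""  # colon-terminated entries, e.g. ".txt:.png:"
    for _dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        for name in sorted(filenames):
            ext = os.path.splitext(name)[1] or NO_EXT
            # Search for "ext:" rather than "ext", so that ".c" is not
            # mistaken for a match inside an existing ".cpp:" entry.
            if ext + ":" not in extlist:
                extlist += ext + ":"
    # Drop the empty string left after the final colon.
    return extlist.split(":")[:-1]
```

Calling unique_extensions(r"C:\MyDirectory") would return a list such as the one the batch script prints, one extension per element. As with find in the script, the comparison is case-sensitive, so .pdf and .PDF are listed separately.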
ANSWER 2
Score 38
Although not strictly meeting the requirement for a batch script, I have used a single-line piped PowerShell script:
Get-Childitem C:\MyDirectory -Recurse -File | Group Extension -NoElement | Sort Count -Desc > FileExtensions.txt
Where:
Get-ChildItem C:\MyDirectory -Recurse retrieves all files in the directory and subdirectories.
Group Extension -NoElement groups the results by the file extension.
Sort Count -Desc sorts the results by the number of matching extensions in each group (from most to least).
> FileExtensions.txt redirects the results to the specified file.
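The group-count-sort pipeline described above can also be sketched in Python, which may help if PowerShell is not available. This is only an illustration, not the answer's method: count_extensions is a hypothetical name, and os.path.splitext may classify dot-files such as .gitignore differently from PowerShell's Extension property.

```python
import os
from collections import Counter


def count_extensions(root="."):
    """Count files per extension under root, most common first."""
    counts = Counter()
    for _dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # splitext returns ("name", ".ext"); index 1 is the extension.
            counts[os.path.splitext(name)[1]] += 1
    # most_common() orders by count, descending, like Sort Count -Desc.
    return counts.most_common()
```

The result is a list of (extension, count) pairs, which you could print or write to a file in whatever layout you prefer.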
You could potentially run it from the command line/batch file:
Powershell -Command "& Get-Childitem C:\MyDirectory -Recurse -File | Group Extension -NoElement | Sort Count -Desc > FileExtensions.txt"
If you remove C:\MyDirectory it will execute in the current directory.
Edit 2021-04-20: As per the comment from @ManSamVampire, if you want to find hidden files as well, you should add -Force before -Recurse in the above command.
At the end it will produce a FileExtensions.txt containing something like the following:
+-------+------+
| Count | Name |
+-------+------+
| ----- | ---- |
| 8216 | .xml |
| 4854 | .png |
| 4378 | .dll |
| 3565 | .htm |
| ... | ... |
+-------+------+
Depending on your folder structure, you may occasionally get errors notifying you that you have a long path.
Get-ChildItem : The specified path, file name, or both are too long. The fully qualified file name must be less than 260 characters, and the directory name must be less than 248 characters.
Any subdirectories under the over-long path will not be parsed either, but the results for everything else will still show.
Notes
You will of course need PowerShell, which is available as a separate download. It also runs on multiple operating systems.
ANSWER 3
Score 4
Here's a detailed answer using PowerShell (with Windows XP you'll have to install PowerShell):
ANSWER 4
Score 4
To list all unique extensions from cmd under the path you're on, use:
Powershell -Command "Get-ChildItem . -Include *.* -Recurse | Select-Object Extension | Sort-Object -Property Extension -Unique"
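As a rough cross-check of what that one-liner produces, here is a small Python sketch that collects the distinct extensions and sorts them. It is an approximation under stated assumptions, not a port: sorted_unique_extensions is a hypothetical name, only files are examined, names without an extension are skipped, and case is folded because Sort-Object compares case-insensitively by default.

```python
import os


def sorted_unique_extensions(root="."):
    """Return the distinct, non-empty file extensions under root, sorted."""
    seen = set()
    for _dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            ext = os.path.splitext(name)[1]
            if ext:  # skip names that have no extension
                seen.add(ext.lower())  # fold case (assumption, see above)
    return sorted(seen)
```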
ANSWER 5
Score 0
I found it useful to change
if "!ext!"=="" set ext=FileWithNoExtension
to
if "!ext!"=="" set ext=.FileWithNoExtension
and to change
echo %extlist::=!LF!%
to
echo %extlist::=!LF!% > ext-list.txt
The generated file contained the following (no linefeeds, but no matter):
.bat.pdf.skp.ai.png.jpg.tif.pcp.txt.lst.ttf.dfont.psd.indd.docx.PDF.JPG.gif.jpeg.dwg.exr.FileWithNoExtension.vrlmap.sat.bak.ctb
which I was then able to use for my project.