bug-bash

Re: count


From: John McKown
Subject: Re: count
Date: Mon, 21 Dec 2015 08:21:04 -0600

On Fri, Dec 18, 2015 at 7:05 PM, Krem <valkrem@yahoo.com> wrote:
Hi all,

I have one folder, and this folder contains several folders. Each subfolder
contains 5 or 6 files. So I want to count the number of rows within each
file and produce an output.

Assume the main folder is called A and it has three subfolders: folder1,
folder2 and folder3.
Folder1 has 4 files: file1, file2, file3 and file4.

The same thing is for folder2 and folder3.
Assume that file1 has 36 rows (wc -l file1 = 36).
Assume that file2 has 50 rows (wc -l file2 = 50).
Assume that file3 has 120 rows (wc -l file3 = 120).
Assume that file4 has 15 rows (wc -l file4 = 15).


I want the output
mainfolder  subfolder    filename       # of rows

A           Folder1      file1          36
A           Folder1      file2          50
A           Folder1      file3          120
A           Folder1      file4          15
A           folder2      filename1      ..
..
..
..
A           last_folder  last_filename  ...

Can anyone help me out?

Thanks in advance

Krem,

If you want something "entirely different", the following is "simpler", but more complicated to understand.

find . -maxdepth 2 -mindepth 2 -type f \( -name '*.csv' -o -name '*.txt' \) |\
egrep '^\./[0-9]' |\
xargs awk 'ENDFILE {print FILENAME "\t" FNR;}' |\
sed -r 's|^\./||;s|/|\t|' |\
xargs -L 1 echo -e "${PWD##*/}\t"

The "find" finds the files ending with .txt or .csv in the directories directly below this directory.
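One thing worth checking in any "find" expression that mixes -type with -o is operator precedence: the implicit "and" binds tighter than -o, so -type f only covers both name tests when they are grouped in escaped parentheses. A minimal sketch of the difference, using a scratch tree whose names (1dir, 2dir, notes.txt, and so on) are made up for illustration:

```shell
# Hypothetical scratch tree, used only for this illustration.
tmp=$(mktemp -d)
mkdir -p "$tmp/1dir/notes.txt" "$tmp/2dir"   # notes.txt is a DIRECTORY here
printf 'a\nb\n' > "$tmp/1dir/data.csv"
printf 'c\n'    > "$tmp/2dir/list.txt"

# Without parentheses: -type f applies only to the *.csv test,
# so the directory named notes.txt is matched as well.
ungrouped=$(cd "$tmp" && find . -mindepth 2 -maxdepth 2 \
    -type f -name '*.csv' -o -name '*.txt' | sort)

# With parentheses: -type f applies to both name tests.
grouped=$(cd "$tmp" && find . -mindepth 2 -maxdepth 2 \
    -type f \( -name '*.csv' -o -name '*.txt' \) | sort)

printf 'ungrouped:\n%s\n\ngrouped:\n%s\n' "$ungrouped" "$grouped"
rm -rf "$tmp"
```

The ungrouped form lists the notes.txt directory in addition to the two regular files; the grouped form lists only the regular files.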

The egrep subsets those files to those in directories whose name starts with a digit (followed by anything or nothing). "^\./" matches the _junk_ that "find" puts in the front.

xargs gathers the file names and passes them as arguments to the "awk" command.

The "awk" program prints, for each file, its name (complete with sub-directory) from the FILENAME variable and its number of records from the FNR variable. Note that ENDFILE is a GNU awk (gawk) extension.
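For systems where "awk" is not gawk, the same per-file count can be approximated without ENDFILE by noticing when FNR resets to 1. This is only a sketch, not a drop-in replacement: the function name count_lines is made up here, and unlike ENDFILE it silently skips empty files, because awk never reads a record from them.

```shell
# count_lines (hypothetical helper): print "name<TAB>line count"
# for each non-empty file given as an argument.
count_lines() {
    awk '
        # FNR restarts at 1 for each new file; report the file just finished.
        FNR == 1 && NR > 1 { print prev "\t" n }
        # Remember the current file name and its running line count.
        { prev = FILENAME; n = FNR }
        # Report the last file, if any record was read at all.
        END { if (NR > 0) print prev "\t" n }
    ' "$@"
}

# Small demonstration on two scratch files.
demo_dir=$(mktemp -d)
printf 'a\nb\n'    > "$demo_dir/file1"
printf 'x\ny\nz\n' > "$demo_dir/file2"
demo_out=$(cd "$demo_dir" && count_lines file1 file2)
printf '%s\n' "$demo_out"
rm -rf "$demo_dir"
```

In the pipeline above it would be used as "xargs awk ..." is, with the file names supplied as arguments.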

The "sed" removes the "./" that "find" put at the front of the file name, and changes the first "/" in the FILENAME to a tab to separate the directory name from the base file name. Note that this only works properly in this case, because the files are restricted to the directories directly below this directory and no lower; it will _fail_ in the generic case!
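To see both the intended effect and the failure mode, here is a small sketch that feeds two sample lines through the same two substitutions. The paths are invented, split_path is a hypothetical wrapper name, and it assumes GNU sed (which understands \t in the replacement and accepts -E as well as -r):

```shell
# split_path (hypothetical helper): strip the leading "./" and turn
# the FIRST "/" into a tab. Assumes GNU sed for "\t" in the replacement.
split_path() {
    sed -E 's|^\./||; s|/|\t|'
}

# One directory level: directory and file name end up in separate fields.
shallow=$(printf './1dir/data.csv\t2\n' | split_path)

# Two directory levels: only the first "/" is replaced, so the second
# field still contains a path ("sub/data.csv").
deep=$(printf './1dir/sub/data.csv\t2\n' | split_path)

printf '%s\n%s\n' "$shallow" "$deep"
```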

The last "xargs -L 1" creates the "echo" commands to actually output the data. The "-L 1" says to invoke a separate "echo" for each line of input, instead of xargs' norm of putting all together as a single line. This is needed to properly format the output. The ${PWD##*/} reduces the current directory path, from the root, to just the name of this directory, without its parent directories.
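The ## form of parameter expansion removes the longest leading match of the pattern, which is why "*/" eats everything up to the last slash. A quick illustration on an ordinary variable (the path /home/krem/A is made up for the example), with the related # and % forms for contrast:

```shell
path=/home/krem/A        # hypothetical path, for illustration only

base="${path##*/}"       # longest leading match of "*/" removed   -> A
rest="${path#*/}"        # shortest leading match of "*/" removed  -> home/krem/A
parent="${path%/*}"      # shortest trailing match of "/*" removed -> /home/krem

printf '%s\n' "$base" "$rest" "$parent"
```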

This is "more elegant" (in my mathematically twisted mind) than my previous solution. But it is a bit more difficult to really grasp until you really get used to BASH, or unless your mind has already been damaged by programming in the APL language, as mine has [grin]. Hmm, SQL set logic might cause similar "damage".


--
Computer Science is the only discipline in which we view adding a new wing to a building as being maintenance -- Jim Horning

Schrodinger's backup: The condition of any backup is unknown until a restore is attempted.

Yoda of Borg, we are. Futile, resistance is, yes. Assimilated, you will be.

He's about as useful as a wax frying pan.

10 to the 12th power microphones = 1 Megaphone

Maranatha! <><
John McKown
