Replace data between quotes in a file

I want to remove the commas that appear between double quotes (" ") in a comma-delimited data file.

Sample input file:

,7/30/2019,7/31/2019,Wed,8/1/2019,FH/FN 30yr & 20yr TBA & Spec ,"10,000",8/13/2019,

Expected output:

,7/30/2019,7/31/2019,Wed,8/1/2019,FH/FN 30yr & 20yr TBA & Spec ,"10000",8/13/2019,

asked 10 hours ago by Ramkumar
edited 9 hours ago by Jesse_b

(1) Will there only be one pair of double quotes in one line? (2) Can there be higher numbers with multiple comma like 1,000,000? (3) How far did your own attempts lead?

– Philippos
10 hours ago

(1) Will there only be one pair of double quotes in one line? No, there can be many, but I am okay with replacing the commas in between " " with a blank. (2) Can there be higher numbers with multiple comma like 1,000,000? Yes. (3) How far did your own attempts lead?

$ cat asdf
,7/30/2019,7/31/2019,Wed,8/1/2019,FH/FN 30yr & 20yr TBA & Spec ,"10,000",8/13/2019,
$ sed '/"/,/"/s/,//' asdf
7/30/2019,7/31/2019,Wed,8/1/2019,FH/FN 30yr & 20yr TBA & Spec ,"10,000",8/13/2019,

– Ramkumar
10 hours ago

Do you need to replace all numbers with a "," in them?

– Ned64
10 hours ago

I need to replace them with blank. "10,000" to "10000"

– Ramkumar
10 hours ago

Is there a limit on the numbers (e.g. can 12,000,000,000,000 occur?), how many "," max?

– Ned64
10 hours ago

Tags: text-processing, awk, sed

5 Answers

Another awk solution:

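awk -F'"' '{
 # rebuild modified records with the double quote as output separator
 OFS="\"";
 for ( i = 1; i <= NF; i++ ) {
  # even-numbered fields are the ones between a pair of quotes
  if ( i % 2 == 0 ) {
   gsub(/,/, "", $i)
  }
 }
}
1' input.csv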

This will use the double quote as a field separator and loop through all fields. If the field number is an even number (which is not fool-proof, but given your example it should mean that the field exists between quotes) it will remove any commas from that field. The 1 will cause awk to print everything (with the changes made) using the double quote as the output field separator.
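To see why the even-numbered fields are the quoted ones, here is a minimal sketch that only prints how awk splits a line on double quotes. The sample line and the variable names are made up for illustration and are not from the question's data.

```shell
# Hypothetical sample line with two quoted fields.
line=',a,"1,000",b,"x, y",c'

# Split on double quotes and print each field with its number.
fields=$(printf '%s\n' "$line" |
  awk -F'"' '{ for (i = 1; i <= NF; i++) printf "%d:[%s]\n", i, $i }')

printf '%s\n' "$fields"
# 1:[,a,]
# 2:[1,000]
# 3:[,b,]
# 4:[x, y]
# 5:[,c]
```

Fields 2 and 4 are the text between quote pairs. This only holds while quotes are balanced and never escaped inside a field, which is the "not fool-proof" part.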

In use:

$ cat input.csv
,7/30/2019,7/31/2019,Wed,8/1/2019,FH/FN 30yr & 20yr TBA & Spec ,"10,000",8/13/2019,
,7/30/2019,7/31/2019,"100",FH/FN 30yr & 20yr TBA & Spec ,"10,000,000",8/13/2019,
,7/30/2019,7/31/2019,"Jack, Mary, and Jane",8/1/2019,"123,456,789,012,345,678","10,000",8/13/2019,
$ awk -F'"' '{
>  OFS="\"";
>  for ( i = 1; i <= NF; i++ ) {
>   if ( i % 2 == 0 ) {
>    gsub(/,/, "", $i)
>   }
>  }
> }
> 1' input.csv
,7/30/2019,7/31/2019,Wed,8/1/2019,FH/FN 30yr & 20yr TBA & Spec ,"10000",8/13/2019,
,7/30/2019,7/31/2019,"100",FH/FN 30yr & 20yr TBA & Spec ,"10000000",8/13/2019,
,7/30/2019,7/31/2019,"Jack Mary and Jane",8/1/2019,"123456789012345678","10000",8/13/2019,

NOTE: This will also remove the commas in quoted fields that are not numbers. In order to read this file correctly as a CSV you will need to do that. If for some reason you want to retain those commas, you can use the solution below, which only touches quoted fields that contain a digit.

awk -F'"' '{
 OFS="\"";
 for ( i = 1; i <= NF; i++ ) {
  # only strip commas from quoted fields that contain a digit
  if ( i % 2 == 0 && $i ~ /[0-9]/ ) {
   gsub(/,/, "", $i)
  }
 }
}
1' input.csv

answered 9 hours ago by Jesse_b, edited 8 hours ago

This will not work correctly if there are non-numbers in quotation marks. These are not to be changed. Anyway, why write the same thing again I already answered?

– Ned64
9 hours ago

I thought, the more precise, the better.

– Ned64
9 hours ago

@Ned64: I don't disagree, but this site encourages multiple answers to a single problem.

– Jesse_b
9 hours ago

Upon thinking about it further, it seems like OP is trying to make this a readable csv file, so having commas within any field would be undesirable, whether that field is a number or not. It seems the correct thing to do is to remove the commas between Jack, Mary, and Jane otherwise that will be split into 3 separate fields when read as a CSV.

– Jesse_b
9 hours ago

No, if the commas are between quotes "" they will be ignored as field delimiters. That's likely the reason why the quotes are there in the first place, even though they are unusual around a number.

– Ned64
8 hours ago


Try for example awk:

cat oldfile | awk '{ print gensub("(,\"[0-9]+),([0-9][0-9][0-9]),?([0-9][0-9][0-9])?,?([0-9][0-9][0-9])?,?", "\\1\\2\\3\\4", "g"); }' > newfile

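This works for larger numbers, too, as long as they have no more three-digit groups than the pattern spells out. Note that gensub is a GNU awk (gawk) extension.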

Explanation:

awk is a programmable filter. The command given here in the commandline (between the outer single quotes "'") will be executed for every line of input from your file.

The awk program looks like this (different formatting):


{
 print gensub("(,\"[0-9]+),([0-9][0-9][0-9]),?([0-9][0-9][0-9])?,?([0-9][0-9][0-9])?,?",
              "\\1\\2\\3\\4",
              "g");
}

The gawk built-in function gensub replaces whatever matches its first argument with the replacement given in the second. If the third argument is a string starting with "g" or "G", it replaces all non-overlapping matches on the line.

What is replaced? The first argument is a regular expression (q.v.) in double quotes. Here are its parts: ," (a comma and the opening double quote, escaped as \" inside the awk string), then [0-9]+, which means a digit 0-9 repeated one or more times (postfix operator +), then , which is just a literal comma, then [0-9][0-9][0-9] (three digits) and a comma followed by a question mark, ,? (the postfix ? is new: it means the preceding item, here the comma, can be omitted). Then come more three-digit groups and commas, each of which may be omitted; these are for larger numbers.

In this explanation I have left out the parentheses ( and ) so far. They mark parts of the match that are remembered. In the second argument to gensub we reference the first through fourth remembered parts as \1 through \4 (written \\1 through \\4 inside the awk string), and so print the digits again without the commas between them.
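Since gensub is only available in gawk, the same capture-and-reference idea can be tried with sed -E. This is a sketch under that substitution, not the command from the answer. The sample line is hypothetical, and [0-9]{3} is shorthand for [0-9][0-9][0-9].

```shell
# Hypothetical sample line with two quoted numbers.
line=',a,"10,000",b,"1,000,000",c'

# Same pattern as in the explanation: remember the digit groups in
# \1..\4 and write them back without the commas between them.
joined=$(printf '%s\n' "$line" |
  sed -E 's/(,"[0-9]+),([0-9]{3}),?([0-9]{3})?,?([0-9]{3})?,?/\1\2\3\4/g')

printf '%s\n' "$joined"
# ,a,"10000",b,"1000000",c
```

As with the awk version, the pattern only covers as many three-digit groups as it spells out.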

answered 9 hours ago by Ned64, edited 8 hours ago

Thanks this worked. But i am not able to understand how it is working. Could you please explain?

– Ramkumar
9 hours ago

OK, but only if you mark the answer as working :-)

– Ned64
9 hours ago

:-) :-) I did.. Of course i will do. :-)

– Ramkumar
9 hours ago

@Ramkumar Explanation OK?

– Ned64
9 hours ago

@Jesse_b Thanks. That's why I asked about the maximum number in the beginning.

– Ned64
8 hours ago


Assuming this is properly formatted CSV (the example data looks ok in this respect), we can use csvformat from csvkit to temporarily change the field delimiters to some other character not otherwise present in the data, such as @, delete all commas, and change the field delimiter back to the default again:

$ csvformat -D '@' file.csv | tr -d , | csvformat -d '@'
,7/30/2019,7/31/2019,Wed,8/1/2019,FH/FN 30yr & 20yr TBA & Spec ,10000,8/13/2019,

The output does not have quotes around the field that we modified, but that's because it no longer needs them.

Obviously, "deleting all commas" may delete commas that we don't actually want to delete, so we can be a bit more selective and only delete the commas in the 7th field:

$ csvformat -D '@' file.csv | awk -F '@' 'BEGIN { OFS=FS } { gsub(",", "", $7); print }' | csvformat -d '@'
,7/30/2019,7/31/2019,Wed,8/1/2019,FH/FN 30yr & 20yr TBA & Spec ,10000,8/13/2019,

answered 8 hours ago by Kusalananda♦, edited 7 hours ago


Try this:

sed 's/\(".*\),\(.*"\)/\1\2/' file

answered 10 hours ago by guillermo chamorro, edited 9 hours ago

This will only replace one comma and will make a mess when there are more double quotes on that line.

– Philippos
9 hours ago

@Philippos Agreed with the comma, I'll keep working and edit if necessary, although OP data does not show any other double quotes.

– guillermo chamorro
9 hours ago

@Philippos If it works for the data the OP has it could be OK? I tried a more specific answer, let's see what Ramkumar thinks.

– Ned64
9 hours ago

As Philippos said, if I have more data after "10,000", it fails. But thanks for responding. I am taking Ned64's syntax.

$ cat asdf
,7/30/2019,7/31/2019,"Wed",8/1/2019,"FH/FN 30yr & 20yr TBA & Spec" ,"10,000",8/13/2019,"ram",
$ sed 's/\(".*\),\(.*"\)/\1\2/' asdf
,7/30/2019,7/31/2019,"Wed",8/1/2019,"FH/FN 30yr & 20yr TBA & Spec" ,"10,000",8/13/2019"ram",

– Ramkumar
9 hours ago


Your own attempt sed '/"/,/"/s/,//' fails because the address range you give only filters for a range of lines, not a range inside a line.
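A small sketch of what that address range actually does, on made-up input (the four sample lines and variable names are hypothetical): the range runs from the first line containing a quote to the next later line containing one, and s/,// removes only the first comma of each line in that range.

```shell
# Hypothetical four-line input.
input=',a,"10,000",b
c,d
e,"f",g
h,i'

ranged=$(printf '%s\n' "$input" | sed '/"/,/"/s/,//')

printf '%s\n' "$ranged"
# a,"10,000",b   <- range starts here; only the first comma is removed
# cd             <- still inside the range
# e"f",g         <- next line with a quote ends the range
# h,i            <- outside the range, untouched
```

Nothing here looks at what lies between two quotes on the same line, which is why the quoted number stays unchanged.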

This type of task is nasty in standard sed. If it's just about one comma, then sed -E 's/("[0-9]*),([0-9]*")/\1 \2/' would do the trick, but for multiple commas you'd have to loop, giving ugly results like

sed -Ee :loop -e 's/("[0-9 ]*),([^"]*")/\1 \2/;tloop'

The ("[0-9 ]*) matches a double quote followed by any number of digits or spaces and is referred to as \1 in the replacement; the ([^"]*") matches anything after the comma up to the next ", so \1 \2 is the same text, but with the first comma replaced by a space.

Now the t command branches to the loop label if a replacement was made. This gets repeated until there is no comma left to be replaced.

This works for more than one number per line, with as many commas as you like. Be aware, though, that the pattern does not pair up opening and closing quotes, so a comma that directly follows a closing quote can be replaced as well: ,7/30/2019,"99,999,999,999,999",0,1 ,"10,000","foo, bar" will get transformed to ,7/30/2019,"99 999 999 999 999" 0 1 "10 000" "foo, bar"
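In the sample output above, the commas between the quoted fields are lost along with the ones inside the numbers. A stricter variant of the loop might look like the following sketch. It assumes that a quoted number consists only of digits and commas and that a closing quote is never directly followed by a digit; the pattern is an illustration, not the command from the answer.

```shell
# Sample line taken from the answer above.
line=',7/30/2019,"99,999,999,999,999",0,1 ,"10,000","foo, bar"'

# Require a digit right after the opening quote and only digits and
# commas up to the closing quote, so that a closing quote followed by
# a field-separating comma can no longer match.
strict=$(printf '%s\n' "$line" |
  sed -E -e :loop -e 's/("[0-9][0-9 ]*),([0-9][0-9,]*")/\1 \2/' -e tloop)

printf '%s\n' "$strict"
# ,7/30/2019,"99 999 999 999 999",0,1 ,"10 000","foo, bar"
```

Replacing the space in `\1 \2` with nothing (`\1\2`) would give the "10000" form asked for in the question.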

answered 8 hours ago by Philippos, edited 7 hours ago

add a comment |

Your Answer

StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "106"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);

);

draft saved

draft discarded

StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2funix.stackexchange.com%2fquestions%2f534718%2freplace-data-between-quotes-in-a-file%23new-answer', 'question_page');

);

Post as a guest

Name

Required, but never shown

5 Answers
5

active

oldest

votes

5 Answers
5

active

oldest

votes

Another awk solution:

awk -F" '
 OFS=""";
 for ( i = 1; i <= NF; i++ ) 
 if ( i % 2 == 0 ) 
 gsub(/,/, "", $i)
 
 
1' input.csv

In use:

$ cat input.csv
,7/30/2019,7/31/2019,Wed,8/1/2019,FH/FN 30yr & 20yr TBA & Spec ,"10,000",8/13/2019,
,7/30/2019,7/31/2019,"100",FH/FN 30yr & 20yr TBA & Spec ,"10,000,000",8/13/2019,
,7/30/2019,7/31/2019,"Jack, Mary, and Jane",8/1/2019,"123,456,789,012,345,678","10,000",8/13/2019,
$ awk -F" '
> OFS=""";
> for ( i = 1; i <= NF; i++ ) 
> if ( i % 2 == 0 ) 
> gsub(/,/, "", $i)
> 
> 
> 1' input.csv
,7/30/2019,7/31/2019,Wed,8/1/2019,FH/FN 30yr & 20yr TBA & Spec ,"10000",8/13/2019,
,7/30/2019,7/31/2019,"100",FH/FN 30yr & 20yr TBA & Spec ,"10000000",8/13/2019,
,7/30/2019,7/31/2019,"Jack Mary and Jane",8/1/2019,"123456789012345678","10000",8/13/2019,

awk -F" '
 OFS=""";
 for ( i = 1; i <= NF; i++ ) 
 if ( i % 2 == 0 && $i ~ /[0-9]/ ) 
 gsub(/,/, "", $i)
 
 
1' input.csv

edited 8 hours ago

answered 9 hours ago

Jesse_b

18.5k3 gold badges46 silver badges86 bronze badges

This will not work correctly if there are non-numbers in quotation marks. These are not to be changed. Anyway, why write the same thing again I already answered?

– Ned64
9 hours ago

I thought, the more precise, the better.

– Ned64
9 hours ago

1

@Ned64: I don't disagree, but this site encourages multiple answers to a single problem.

– Jesse_b
9 hours ago

Upon thinking about it further, it seems like OP is trying to make this a readable csv file, so having commas within any field would be undesirable, whether that field is a number or not. It seems the correct thing to do is to remove the commas between Jack, Mary, and Jane otherwise that will be split into 3 separate fields when read as a CSV.

– Jesse_b
9 hours ago

No, if the commas are between quotes "" they will be ignored as field delimiters. That's likely the reason why the quotes are there in the first place, even though they are unusual around a number.

– Ned64
8 hours ago

|
show 5 more comments

Another awk solution:

awk -F" '
 OFS=""";
 for ( i = 1; i <= NF; i++ ) 
 if ( i % 2 == 0 ) 
 gsub(/,/, "", $i)
 
 
1' input.csv

In use:

$ cat input.csv
,7/30/2019,7/31/2019,Wed,8/1/2019,FH/FN 30yr & 20yr TBA & Spec ,"10,000",8/13/2019,
,7/30/2019,7/31/2019,"100",FH/FN 30yr & 20yr TBA & Spec ,"10,000,000",8/13/2019,
,7/30/2019,7/31/2019,"Jack, Mary, and Jane",8/1/2019,"123,456,789,012,345,678","10,000",8/13/2019,
$ awk -F" '
> OFS=""";
> for ( i = 1; i <= NF; i++ ) 
> if ( i % 2 == 0 ) 
> gsub(/,/, "", $i)
> 
> 
> 1' input.csv
,7/30/2019,7/31/2019,Wed,8/1/2019,FH/FN 30yr & 20yr TBA & Spec ,"10000",8/13/2019,
,7/30/2019,7/31/2019,"100",FH/FN 30yr & 20yr TBA & Spec ,"10000000",8/13/2019,
,7/30/2019,7/31/2019,"Jack Mary and Jane",8/1/2019,"123456789012345678","10000",8/13/2019,

awk -F" '
 OFS=""";
 for ( i = 1; i <= NF; i++ ) 
 if ( i % 2 == 0 && $i ~ /[0-9]/ ) 
 gsub(/,/, "", $i)
 
 
1' input.csv

edited 8 hours ago

answered 9 hours ago

Jesse_b

18.5k3 gold badges46 silver badges86 bronze badges

This will not work correctly if there are non-numbers in quotation marks. These are not to be changed. Anyway, why write the same thing again I already answered?

– Ned64
9 hours ago

I thought, the more precise, the better.

– Ned64
9 hours ago

1

@Ned64: I don't disagree, but this site encourages multiple answers to a single problem.

– Jesse_b
9 hours ago

Upon thinking about it further, it seems like OP is trying to make this a readable csv file, so having commas within any field would be undesirable, whether that field is a number or not. It seems the correct thing to do is to remove the commas between Jack, Mary, and Jane otherwise that will be split into 3 separate fields when read as a CSV.

– Jesse_b
9 hours ago

No, if the commas are between quotes "" they will be ignored as field delimiters. That's likely the reason why the quotes are there in the first place, even though they are unusual around a number.

– Ned64
8 hours ago

|
show 5 more comments

Another awk solution:

awk -F" '
 OFS=""";
 for ( i = 1; i <= NF; i++ ) 
 if ( i % 2 == 0 ) 
 gsub(/,/, "", $i)
 
 
1' input.csv

In use:

$ cat input.csv
,7/30/2019,7/31/2019,Wed,8/1/2019,FH/FN 30yr & 20yr TBA & Spec ,"10,000",8/13/2019,
,7/30/2019,7/31/2019,"100",FH/FN 30yr & 20yr TBA & Spec ,"10,000,000",8/13/2019,
,7/30/2019,7/31/2019,"Jack, Mary, and Jane",8/1/2019,"123,456,789,012,345,678","10,000",8/13/2019,
$ awk -F" '
> OFS=""";
> for ( i = 1; i <= NF; i++ ) 
> if ( i % 2 == 0 ) 
> gsub(/,/, "", $i)
> 
> 
> 1' input.csv
,7/30/2019,7/31/2019,Wed,8/1/2019,FH/FN 30yr & 20yr TBA & Spec ,"10000",8/13/2019,
,7/30/2019,7/31/2019,"100",FH/FN 30yr & 20yr TBA & Spec ,"10000000",8/13/2019,
,7/30/2019,7/31/2019,"Jack Mary and Jane",8/1/2019,"123456789012345678","10000",8/13/2019,

awk -F" '
 OFS=""";
 for ( i = 1; i <= NF; i++ ) 
 if ( i % 2 == 0 && $i ~ /[0-9]/ ) 
 gsub(/,/, "", $i)
 
 
1' input.csv

edited 8 hours ago

answered 9 hours ago

Jesse_b

18.5k3 gold badges46 silver badges86 bronze badges

Another awk solution:

awk -F" '
 OFS=""";
 for ( i = 1; i <= NF; i++ ) 
 if ( i % 2 == 0 ) 
 gsub(/,/, "", $i)
 
 
1' input.csv

In use:

$ cat input.csv
,7/30/2019,7/31/2019,Wed,8/1/2019,FH/FN 30yr & 20yr TBA & Spec ,"10,000",8/13/2019,
,7/30/2019,7/31/2019,"100",FH/FN 30yr & 20yr TBA & Spec ,"10,000,000",8/13/2019,
,7/30/2019,7/31/2019,"Jack, Mary, and Jane",8/1/2019,"123,456,789,012,345,678","10,000",8/13/2019,
$ awk -F" '
> OFS=""";
> for ( i = 1; i <= NF; i++ ) 
> if ( i % 2 == 0 ) 
> gsub(/,/, "", $i)
> 
> 
> 1' input.csv
,7/30/2019,7/31/2019,Wed,8/1/2019,FH/FN 30yr & 20yr TBA & Spec ,"10000",8/13/2019,
,7/30/2019,7/31/2019,"100",FH/FN 30yr & 20yr TBA & Spec ,"10000000",8/13/2019,
,7/30/2019,7/31/2019,"Jack Mary and Jane",8/1/2019,"123456789012345678","10000",8/13/2019,

awk -F" '
 OFS=""";
 for ( i = 1; i <= NF; i++ ) 
 if ( i % 2 == 0 && $i ~ /[0-9]/ ) 
 gsub(/,/, "", $i)
 
 
1' input.csv

edited 8 hours ago

answered 9 hours ago

Jesse_b

18.5k3 gold badges46 silver badges86 bronze badges

edited 8 hours ago

answered 9 hours ago

Jesse_b

18.5k3 gold badges46 silver badges86 bronze badges

answered 9 hours ago

Jesse_b

18.5k3 gold badges46 silver badges86 bronze badges

answered 9 hours ago

Jesse_b

18.5k3 gold badges46 silver badges86 bronze badges

This will not work correctly if there are non-numbers in quotation marks. These are not to be changed. Anyway, why write the same thing again I already answered?

– Ned64
9 hours ago

I thought, the more precise, the better.

– Ned64
9 hours ago

1

@Ned64: I don't disagree, but this site encourages multiple answers to a single problem.

– Jesse_b
9 hours ago

Upon thinking about it further, it seems like OP is trying to make this a readable csv file, so having commas within any field would be undesirable, whether that field is a number or not. It seems the correct thing to do is to remove the commas between Jack, Mary, and Jane otherwise that will be split into 3 separate fields when read as a CSV.

– Jesse_b
9 hours ago

No, if the commas are between quotes "" they will be ignored as field delimiters. That's likely the reason why the quotes are there in the first place, even though they are unusual around a number.

– Ned64
8 hours ago

|
show 5 more comments

This will not work correctly if there are non-numbers in quotation marks. These are not to be changed. Anyway, why write the same thing again I already answered?

– Ned64
9 hours ago

I thought, the more precise, the better.

– Ned64
9 hours ago

1

@Ned64: I don't disagree, but this site encourages multiple answers to a single problem.

– Jesse_b
9 hours ago

Upon thinking about it further, it seems like OP is trying to make this a readable csv file, so having commas within any field would be undesirable, whether that field is a number or not. It seems the correct thing to do is to remove the commas between Jack, Mary, and Jane otherwise that will be split into 3 separate fields when read as a CSV.

– Jesse_b
9 hours ago

No, if the commas are between quotes "" they will be ignored as field delimiters. That's likely the reason why the quotes are there in the first place, even though they are unusual around a number.

– Ned64
8 hours ago

This will not work correctly if there are non-numbers in quotation marks. These are not to be changed. Anyway, why write the same thing again I already answered?

– Ned64
9 hours ago

I thought, the more precise, the better.

– Ned64
9 hours ago

@Ned64: I don't disagree, but this site encourages multiple answers to a single problem.

– Jesse_b
9 hours ago

Upon thinking about it further, it seems like OP is trying to make this a readable csv file, so having commas within any field would be undesirable, whether that field is a number or not. It seems the correct thing to do is to remove the commas between Jack, Mary, and Jane otherwise that will be split into 3 separate fields when read as a CSV.

– Jesse_b
9 hours ago

No, if the commas are between quotes "" they will be ignored as field delimiters. That's likely the reason why the quotes are there in the first place, even though they are unusual around a number.

– Ned64
8 hours ago

|
show 5 more comments

Try for example awk:

cat oldfile | awk ' print gensub ("(,"[0-9]+),([0-9][0-9][0-9]),?([0-9][0-9][0-9])?,?([0-9][0-9][0-9]),?","\1\2\3\4","g");' > newfile

This works for large numbers, too.

Explanation:

awk is a programmable filter. The command given here in the commandline (between the outer single quotes "'") will be executed for every line of input from your file.

The awk program looks like this (different formatting):


 print gensub ("(,"[0-9]+),([0-9][0-9][0-9]),?([0-9][0-9][0-9])?,?([0-9][0-9][0-9]),?",
 "\1\2\3\4",
 "g");

edited 8 hours ago

answered 9 hours ago

Ned64

2,8561 gold badge14 silver badges39 bronze badges

Thanks this worked. But i am not able to understand how it is working. Could you please explain?

– Ramkumar
9 hours ago

OK, but only if you mark then answer as working :-)

– Ned64
9 hours ago

:-) :-) I did.. Of course i will do. :-)

– Ramkumar
9 hours ago

@Ramkumar Explanation OK?

– Ned64
9 hours ago

@Jesse_b Thanks. That's why I asked about the maximum number in the beginning.

– Ned64
8 hours ago

|
show 3 more comments

Try for example awk:

cat oldfile | awk ' print gensub ("(,"[0-9]+),([0-9][0-9][0-9]),?([0-9][0-9][0-9])?,?([0-9][0-9][0-9]),?","\1\2\3\4","g");' > newfile

This works for large numbers, too.

Explanation:

awk is a programmable filter. The command given here in the commandline (between the outer single quotes "'") will be executed for every line of input from your file.

The awk program looks like this (different formatting):


 print gensub ("(,"[0-9]+),([0-9][0-9][0-9]),?([0-9][0-9][0-9])?,?([0-9][0-9][0-9]),?",
 "\1\2\3\4",
 "g");

edited 8 hours ago

answered 9 hours ago

Ned64

2,8561 gold badge14 silver badges39 bronze badges

Thanks this worked. But i am not able to understand how it is working. Could you please explain?

– Ramkumar
9 hours ago

OK, but only if you mark then answer as working :-)

– Ned64
9 hours ago

:-) :-) I did.. Of course i will do. :-)

– Ramkumar
9 hours ago

@Ramkumar Explanation OK?

– Ned64
9 hours ago

@Jesse_b Thanks. That's why I asked about the maximum number in the beginning.

– Ned64
8 hours ago

$ csvformat -D '@' file.csv | tr -d , | csvformat -d '@'
,7/30/2019,7/31/2019,Wed,8/1/2019,FH/FN 30yr & 20yr TBA & Spec ,10000,8/13/2019,

The output does not have quotes around the field that we modified, but that's because, without its comma, the field no longer needs them.

Obviously, "deleting all commas" may delete commas that we don't actually want to delete, so we can be a bit more selective and only delete the commas in the 7th field:

$ csvformat -D '@' file.csv | awk -F '@' 'BEGIN { OFS=FS } { gsub(",", "", $7); print }' | csvformat -d '@'
,7/30/2019,7/31/2019,Wed,8/1/2019,FH/FN 30yr & 20yr TBA & Spec ,10000,8/13/2019,

edited 7 hours ago

answered 8 hours ago

Kusalananda♦

Try this:

sed 's/\(".*\),\(.*"\)/\1\2/' file

edited 9 hours ago

answered 10 hours ago

guillermo chamorro


This will only replace one comma and will make a mess when there are more double quotes on that line.

– Philippos
9 hours ago

@Philippos Agreed about the comma. I'll keep working on it and edit if necessary, although the OP's data does not show any other double quotes.

– guillermo chamorro
9 hours ago

@Philippos If it works for the data the OP has it could be OK? I tried a more specific answer, let's see what Ramkumar thinks.

– Ned64
9 hours ago

As Philippos said, it fails if I have more data after "10,000". But thanks for responding. I am going with Ned64's syntax.

$ cat asdf
,7/30/2019,7/31/2019,"Wed",8/1/2019,"FH/FN 30yr & 20yr TBA & Spec" ,"10,000",8/13/2019,"ram",
$ sed 's/\(".*\),\(.*"\)/\1\2/' asdf
,7/30/2019,7/31/2019,"Wed",8/1/2019,"FH/FN 30yr & 20yr TBA & Spec" ,"10,000",8/13/2019"ram",

– Ramkumar
9 hours ago

Your own attempt, sed '/"/,/"/s/,//', fails because the address range you give selects a range of lines, not a range inside a line.
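
The line-range behaviour can be seen on a small made-up input. This is only a sketch; the four sample lines are hypothetical and chosen so that the second line contains no double quote at all.

```shell
# Hypothetical four-line input; only lines 1 and 3 contain a double quote.
out=$(printf '%s\n' 'x,"1,000"' 'p,q' 'r,"s"' 't,u' |
  sed '/"/,/"/s/,//')

printf '%s\n' "$out"
# x"1,000"
# pq
# r"s"
# t,u
```

The range opens on line 1 and closes on line 3, so s/,// runs on lines 1 to 3 and removes the first comma of each, including the quote-free line 2. The comma inside "1,000" is never touched.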

sed -Ee :loop -e 's/("[0-9 ]*),([^"]*")/\1 \2/;tloop'

Now the t command branches to the loop mark if a replacement was made. This gets repeated until there is no comma left to be replaced.
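
As a rough illustration of the loop, here it is run on a made-up line, with the back-references written out as \1 and \2. The sample line is hypothetical and assumes only one quoted field per line.

```shell
# Hypothetical sample line with a single quoted field.
line=',7/30/2019,"1,234,567",8/13/2019,'

# Each pass replaces one comma after the opening quote with a space;
# "t loop" jumps back to the label as long as a substitution was made.
out=$(printf '%s\n' "$line" |
  sed -E -e ':loop' -e 's/("[0-9 ]*),([^"]*")/\1 \2/;tloop')

printf '%s\n' "$out"
# ,7/30/2019,"1 234 567",8/13/2019,
```

Two passes are needed here, one per comma. With several quoted fields on a line, the closing quote of one field and the opening quote of the next can be taken as a pair, so this sketch should only be trusted for data like the sample.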

edited 7 hours ago

answered 8 hours ago

Philippos
