Replace data between quotes in a file
I want to remove the commas that appear between double quotes in a comma-delimited data file.
Sample input file:
,7/30/2019,7/31/2019,Wed,8/1/2019,FH/FN 30yr & 20yr TBA & Spec ,"10,000",8/13/2019,
Expected output:
,7/30/2019,7/31/2019,Wed,8/1/2019,FH/FN 30yr & 20yr TBA & Spec ,"10000",8/13/2019,
text-processing awk sed
(1) Will there only be one pair of double quotes in one line? (2) Can there be higher numbers with multiple commas like 1,000,000? (3) How far did your own attempts lead?
– Philippos
10 hours ago
(1) Will there only be one pair of double quotes in one line? No, there can be many, but I am okay with replacing the comma between " " with a blank. (2) Can there be higher numbers with multiple commas like 1,000,000? Yes. (3) How far did your own attempts lead?
$ cat asdf
,7/30/2019,7/31/2019,Wed,8/1/2019,FH/FN 30yr & 20yr TBA & Spec ,"10,000",8/13/2019,
$ sed '/"/,/"/s/,//' asdf
7/30/2019,7/31/2019,Wed,8/1/2019,FH/FN 30yr & 20yr TBA & Spec ,"10,000",8/13/2019,
– Ramkumar
10 hours ago
Do you need to replace all numbers with a "," in them?
– Ned64
10 hours ago
I need to replace them with blank. "10,000" to "10000"
– Ramkumar
10 hours ago
Is there a limit on the numbers (e.g. can 12,000,000,000,000 occur?), how many "," max?
– Ned64
10 hours ago
edited 9 hours ago by Jesse_b
asked 10 hours ago by Ramkumar
5 Answers
Try for example awk:
cat oldfile | awk '{ print gensub("(,\"[0-9]+),([0-9][0-9][0-9]),?([0-9][0-9][0-9])?,?([0-9][0-9][0-9])?,?", "\\1\\2\\3\\4", "g") }' > newfile
This works for large numbers, too. As written, it removes up to four commas per quoted number, which covers values such as 12,000,000,000,000.
Explanation:
awk is a programmable filter. The program given on the command line (between the outer single quotes ') is executed for every line of input from your file.
The awk program looks like this (different formatting):
{
print gensub("(,\"[0-9]+),([0-9][0-9][0-9]),?([0-9][0-9][0-9])?,?([0-9][0-9][0-9])?,?",
"\\1\\2\\3\\4",
"g")
}
The gensub function (a GNU awk extension, so this needs gawk) replaces matches of its first argument with the replacement given in the second. If the third argument is a string starting with "g" or "G", it replaces all occurrences on the line.
What is replaced? The first argument is a regular expression in double quotes. Its parts are: ," (a comma followed by the opening double quote, written \" inside the awk string), then [0-9]+, which means a digit 0-9 repeated one or more times (postfix operator +), then , which is just a literal comma, then [0-9][0-9][0-9] for a group of three digits. After that comes ,? (the postfix ? is new: it means the comma can be omitted), followed by further three-digit groups and commas, each of which may be omitted. These are there for larger numbers.
In this explanation I have left out the parentheses ( and ) so far. They mark the parts of the match that are remembered. In the second argument to gensub we reference the first through fourth remembered parts as \\1 through \\4 (the digits, without the commas between them) and print them out again.
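The fixed number of optional groups caps how many commas a single number can lose. As a hedged alternative sketch (not part of the answer above), the same idea can be written with POSIX awk's match() in a loop, so that it does not depend on gensub and handles any number of three-digit groups. The function name strip_number_commas is made up for illustration, and the sketch assumes that only quoted fields consisting of digits followed by comma-separated three-digit groups should change.

```shell
# Hypothetical helper: remove thousands separators from quoted numbers.
strip_number_commas() {
    awk '{
        out = ""
        # Find the next quoted number that contains at least one ",ddd" group.
        while (match($0, /"[0-9]+(,[0-9][0-9][0-9])+"/)) {
            num = substr($0, RSTART, RLENGTH)
            gsub(/,/, "", num)                      # drop the commas inside it
            out = out substr($0, 1, RSTART - 1) num # keep the text before it
            $0 = substr($0, RSTART + RLENGTH)       # continue after the match
        }
        print out $0
    }' "$@"
}
```

Called as strip_number_commas oldfile > newfile, this leaves quoted text such as "Jack, Mary" untouched because it never matches the digit-only pattern.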
Thanks this worked. But i am not able to understand how it is working. Could you please explain?
– Ramkumar
9 hours ago
OK, but only if you mark the answer as working :-)
– Ned64
9 hours ago
:-) :-) I did.. Of course i will do. :-)
– Ramkumar
9 hours ago
@Ramkumar Explanation OK?
– Ned64
9 hours ago
@Jesse_b Thanks. That's why I asked about the maximum number in the beginning.
– Ned64
8 hours ago
Assuming this is properly formatted CSV (the example data looks ok in this respect), we can use csvformat from csvkit to temporarily change the field delimiters to some other character not otherwise present in the data, such as @, delete all commas, and change the field delimiter back to the default again:
$ csvformat -D '@' file.csv | tr -d , | csvformat -d '@'
,7/30/2019,7/31/2019,Wed,8/1/2019,FH/FN 30yr & 20yr TBA & Spec ,10000,8/13/2019,
The output does not have quotes around the field that we modified, but that's because it no longer needs it.
Obviously, "deleting all commas" may delete commas that we don't actually want to delete, so we can be a bit more selective and only delete the commas in the 7th field:
$ csvformat -D '@' file.csv | awk -F '@' 'BEGIN { OFS=FS } { gsub(",", "", $7); print }' | csvformat -d '@'
,7/30/2019,7/31/2019,Wed,8/1/2019,FH/FN 30yr & 20yr TBA & Spec ,10000,8/13/2019,
Try this:
sed 's/\(".*\),\(.*"\)/\1\2/' file
This will only replace one comma and will make a mess when there are more double quotes on that line.
– Philippos
9 hours ago
@Philippos Agreed with the comma, I'll keep working and edit if necessary, although OP data does not show any other double quotes.
– guillermo chamorro
9 hours ago
@Philippos If it works for the data the OP has it could be OK? I tried a more specific answer, let's see what Ramkumar thinks.
– Ned64
9 hours ago
As Philippos said, if I have more data after "10,000", it fails. But thanks for responding. I am taking Ned64's syntax.
$ cat asdf
,7/30/2019,7/31/2019,"Wed",8/1/2019,"FH/FN 30yr & 20yr TBA & Spec" ,"10,000",8/13/2019,"ram",
$ sed 's/\(".*\),\(.*"\)/\1\2/' asdf
,7/30/2019,7/31/2019,"Wed",8/1/2019,"FH/FN 30yr & 20yr TBA & Spec" ,"10,000",8/13/2019"ram",
– Ramkumar
9 hours ago
Your own attempt sed '/"/,/"/s/,//' fails because the address range you give selects a range of lines, not a range inside a line.
This type of task is nasty in standard sed. If it's just about one comma, then sed -E 's/("[0-9]*),([0-9]*")/\1 \2/' would do the trick, but for multiple commas you'd have to loop, giving ugly results like
sed -Ee :loop -e 's/("[0-9 ]*),([^"]*")/\1 \2/;tloop'
The ("[0-9 ]*) matches a double quote followed by any number of digits and spaces and is referred to as \1 in the replacement; the ([^"]*") matches anything after the comma up to the next ", so \1 \2 is the same text, but with the first comma replaced by a space.
Now the t command branches to the loop label if a replacement was made. This gets repeated until there is no comma left to be replaced.
This even works for lines with more than one number and as many commas as you like:
,7/30/2019,"99,999,999,999,999",0,1 ,"10,000","foo, bar"
will get transformed to
,7/30/2019,"99 999 999 999 999" 0 1 "10 000" "foo, bar"
Note that the pattern cannot tell an opening quote from a closing one, so separator commas between a closing quote and the next opening quote (here around 0,1 and before "foo, bar") can be replaced as well.
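If the goal is the question's expected output ("10000" rather than "10 000"), the loop can be adapted. The following is a sketch under stated assumptions rather than a drop-in replacement: it deletes the comma instead of inserting a space, only touches quoted fields made up of digits and commas, and requires the opening quote to sit at the start of the line or right after a separator comma, so that a closing quote is less likely to be mistaken for an opening one. The function name is hypothetical, and -E is assumed to be available (it is in GNU and BSD sed).

```shell
# Hypothetical helper: delete commas inside quoted, purely numeric fields.
remove_commas_in_quoted_numbers() {
    # (^|,)       start of line or a separator comma before the opening quote
    # ("[0-9]*)   opening quote and the digits seen so far
    # ,           the comma to delete
    # ([0-9,]*")  the rest of the number up to the closing quote
    # "t loop" jumps back for as long as a substitution was made on the line.
    sed -E -e ':loop' -e 's/(^|,)("[0-9]*),([0-9,]*")/\1\2\3/' -e 't loop' "$@"
}
```

A quoted text field that ends in a comma directly before its closing quote could still confuse the pattern, so this is no substitute for a real CSV parser.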
Another awk solution:
awk -F\" '{
    OFS="\"";
    for ( i = 1; i <= NF; i++ ) {
        if ( i % 2 == 0 ) {
            gsub(/,/, "", $i)
        }
    }
}1' input.csv
This uses the double quote as the field separator and loops through all fields. If the field number is even (which is not fool-proof, but given your example it should mean that the field sits between quotes), it removes any commas from that field. The trailing 1 causes awk to print every line (with the changes made), using the double quote as the output field separator.
In use:
$ cat input.csv
,7/30/2019,7/31/2019,Wed,8/1/2019,FH/FN 30yr & 20yr TBA & Spec ,"10,000",8/13/2019,
,7/30/2019,7/31/2019,"100",FH/FN 30yr & 20yr TBA & Spec ,"10,000,000",8/13/2019,
,7/30/2019,7/31/2019,"Jack, Mary, and Jane",8/1/2019,"123,456,789,012,345,678","10,000",8/13/2019,
$ awk -F\" '{
>     OFS="\"";
>     for ( i = 1; i <= NF; i++ ) {
>         if ( i % 2 == 0 ) {
>             gsub(/,/, "", $i)
>         }
>     }
> }1' input.csv
,7/30/2019,7/31/2019,Wed,8/1/2019,FH/FN 30yr & 20yr TBA & Spec ,"10000",8/13/2019,
,7/30/2019,7/31/2019,"100",FH/FN 30yr & 20yr TBA & Spec ,"10000000",8/13/2019,
,7/30/2019,7/31/2019,"Jack Mary and Jane",8/1/2019,"123456789012345678","10000",8/13/2019,
NOTE: This will remove the commas in fields that are not numbers. In order to read this file correctly as a csv you will need to do that. If for some reason you want to retain those commas, you can use the solution below, which only touches quoted fields that contain a digit.
awk -F\" '{
    OFS="\"";
    for ( i = 1; i <= NF; i++ ) {
        if ( i % 2 == 0 && $i ~ /[0-9]/ ) {
            gsub(/,/, "", $i)
        }
    }
}1' input.csv
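Since the even-field test is, as noted, not fool-proof, one possible safeguard is to check that the quotes on a line are balanced before touching it. The sketch below is a variant built on assumptions, not the answer's own code: the function name is made up, and it simply passes lines with an odd number of double quotes through unchanged while warning on standard error. With the double quote as field separator, a non-empty line with balanced quotes always has an odd number of fields.

```shell
# Hypothetical wrapper around the even-field approach.
strip_quoted_commas() {
    awk -F'"' -v OFS='"' '
        NF > 0 && NF % 2 == 0 {
            # An even field count means an odd number of quotes: leave the line alone.
            print "line " NR ": unbalanced quotes, left unchanged" > "/dev/stderr"
            print
            next
        }
        {
            # Even-numbered fields are the ones between a pair of quotes.
            for (i = 2; i <= NF; i += 2)
                gsub(/,/, "", $i)
            print
        }
    ' "$@"
}
```

It can be run as strip_quoted_commas input.csv, or at the end of a pipeline when no file name is given.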
answered 9 hours ago by Jesse_b, edited 8 hours ago
This will not work correctly if there are non-numbers in quotation marks. These are not to be changed. Anyway, why write the same thing again I already answered?
– Ned64
9 hours ago
I thought, the more precise, the better.
– Ned64
9 hours ago
@Ned64: I don't disagree, but this site encourages multiple answers to a single problem.
– Jesse_b
9 hours ago
Upon thinking about it further, it seems like OP is trying to make this a readable csv file, so having commas within any field would be undesirable, whether that field is a number or not. It seems the correct thing to do is to remove the commas between Jack, Mary, and Jane; otherwise that will be split into 3 separate fields when read as a CSV.
– Jesse_b
9 hours ago
No, if the commas are between quotes "" they will be ignored as field delimiters. That's likely the reason why the quotes are there in the first place, even though they are unusual around a number.
– Ned64
8 hours ago
|
show 5 more comments
This will not work correctly if there are non-numbers in quotation marks. These are not to be changed. Anyway, why write the same thing again I already answered?
– Ned64
9 hours ago
I thought, the more precise, the better.
– Ned64
9 hours ago
1
@Ned64: I don't disagree, but this site encourages multiple answers to a single problem.
– Jesse_b
9 hours ago
Upon thinking about it further, it seems like OP is trying to make this a readable csv file, so having commas within any field would be undesirable, whether that field is a number or not. It seems the correct thing to do is to remove the commas betweenJack, Mary, and Janeotherwise that will be split into 3 separate fields when read as a CSV.
– Jesse_b
9 hours ago
No, if the commas are between quotes "" they will be ignored as field delimiters. That's likely the reason why the quotes are there in the first place, even though they are unusual around a number.
– Ned64
8 hours ago
This will not work correctly if there are non-numbers in quotation marks. These are not to be changed. Anyway, why write the same thing again I already answered?
– Ned64
9 hours ago
This will not work correctly if there are non-numbers in quotation marks. These are not to be changed. Anyway, why write the same thing again I already answered?
– Ned64
9 hours ago
I thought, the more precise, the better.
– Ned64
9 hours ago
I thought, the more precise, the better.
– Ned64
9 hours ago
1
1
@Ned64: I don't disagree, but this site encourages multiple answers to a single problem.
– Jesse_b
9 hours ago
@Ned64: I don't disagree, but this site encourages multiple answers to a single problem.
– Jesse_b
9 hours ago
Upon thinking about it further, it seems like OP is trying to make this a readable csv file, so having commas within any field would be undesirable, whether that field is a number or not. It seems the correct thing to do is to remove the commas between
Jack, Mary, and Jane otherwise that will be split into 3 separate fields when read as a CSV.– Jesse_b
9 hours ago
Upon thinking about it further, it seems like OP is trying to make this a readable csv file, so having commas within any field would be undesirable, whether that field is a number or not. It seems the correct thing to do is to remove the commas between
Jack, Mary, and Jane otherwise that will be split into 3 separate fields when read as a CSV.– Jesse_b
9 hours ago
No, if the commas are between quotes "" they will be ignored as field delimiters. That's likely the reason why the quotes are there in the first place, even though they are unusual around a number.
– Ned64
8 hours ago
No, if the commas are between quotes "" they will be ignored as field delimiters. That's likely the reason why the quotes are there in the first place, even though they are unusual around a number.
– Ned64
8 hours ago
|
show 5 more comments
Try for example awk:
cat oldfile | awk ' print gensub ("(,"[0-9]+),([0-9][0-9][0-9]),?([0-9][0-9][0-9])?,?([0-9][0-9][0-9]),?","\1\2\3\4","g");' > newfile
This works for large numbers, too.
Explanation:
awk is a programmable filter. The command given here in the commandline (between the outer single quotes "'") will be executed for every line of input from your file.
The awk program looks like this (different formatting):
print gensub ("(,"[0-9]+),([0-9][0-9][0-9]),?([0-9][0-9][0-9])?,?([0-9][0-9][0-9]),?",
"\1\2\3\4",
"g");
The awk-builtin command gensub replaces things given in the first argument, with the replacement given in the second. If the third argument is a string starting with "g" or "G" it will replace all occurrences (tries until no more are found).
What is replaced? The first argument is a regular expression (q.v.) in double quotes, here are the parts: , then afterwards [0-9]+ which means a digit 0-9 repeated one or more times (postfix operator +) then , which is just a character, then [0-9][0-9][0-9] and a comma , followed by a question mark ? (you know what the first part means now but the postfix ? is new - the comma digits can be omitted). Then more digit groups and commas which may be omitted - this is for larger numbers.
In this explanation I have left out the parentheses ( and ) so far! These mark those things that are matched by the expression but remembered. In the second argument to gensub we reference the first 1 through fourth 4 things that were matched (the numbers) and print them out again here.
Thanks this worked. But i am not able to understand how it is working. Could you please explain?
– Ramkumar
9 hours ago
OK, but only if you mark then answer as working :-)
– Ned64
9 hours ago
:-) :-) I did.. Of course i will do. :-)
– Ramkumar
9 hours ago
@Ramkumar Explanation OK?
– Ned64
9 hours ago
@Jesse_b Thanks. That's why I asked about the maximum number in the beginning.
– Ned64
8 hours ago
|
show 3 more comments
edited 8 hours ago
answered 9 hours ago
Ned64
Assuming this is properly formatted CSV (the example data looks ok in this respect), we can use csvformat from csvkit to temporarily change the field delimiters to some other character not otherwise present in the data, such as @, delete all commas, and change the field delimiter back to the default again:
$ csvformat -D '@' file.csv | tr -d , | csvformat -d '@'
,7/30/2019,7/31/2019,Wed,8/1/2019,FH/FN 30yr & 20yr TBA & Spec ,10000,8/13/2019,
The output does not have quotes around the field that we modified, but that's because the field no longer needs them.
Obviously, "deleting all commas" may delete commas that we don't actually want to delete, so we can be a bit more selective and only delete the commas in the 7th field:
$ csvformat -D '@' file.csv | awk -F '@' 'BEGIN { OFS=FS } { gsub(",", "", $7); print }' | csvformat -d '@'
,7/30/2019,7/31/2019,Wed,8/1/2019,FH/FN 30yr & 20yr TBA & Spec ,10000,8/13/2019,
edited 7 hours ago
answered 8 hours ago
Kusalananda♦
Try this:
sed 's/\(".*\),\(.*"\)/\1\2/' file
This will only replace one comma and will make a mess when there are more double quotes on that line.
– Philippos
9 hours ago
@Philippos Agreed with the comma, I'll keep working and edit if necessary, although OP data does not show any other double quotes.
– guillermo chamorro
9 hours ago
@Philippos If it works for the data the OP has it could be OK? I tried a more specific answer, let's see what Ramkumar thinks.
– Ned64
9 hours ago
As Philippos said, if I have other data after "10,000", it fails. But thanks for responding. I am taking Ned64's syntax.
$ cat asdf
,7/30/2019,7/31/2019,"Wed",8/1/2019,"FH/FN 30yr & 20yr TBA & Spec" ,"10,000",8/13/2019,"ram",
$ sed 's/\(".*\),\(.*"\)/\1\2/' asdf
,7/30/2019,7/31/2019,"Wed",8/1/2019,"FH/FN 30yr & 20yr TBA & Spec" ,"10,000",8/13/2019"ram",
– Ramkumar
9 hours ago
edited 9 hours ago
answered 10 hours ago
guillermo chamorro
Your own attempt sed '/"/,/"/s/,//' fails because the address range you give only filters for a range of lines, not a range inside a line.
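To see what the address range actually does, here is a small sketch with made-up input lines (the function name range_demo and the sample data are only for illustration). The range starts at the first line that contains a double quote and ends at the next later line that contains one; on every line in that range, s/,// removes just the first comma, whether or not it sits between quotes.

```shell
#!/bin/sh
# Hypothetical demo: /"/,/"/ selects a range of LINES, not text inside a line.
range_demo() {
    printf '%s\n' 'a,b,"c,d",e' 'f,g,h' 'i,"j,k"' 'l,m,n' |
        sed '/"/,/"/s/,//'
}

range_demo
# prints:
#   ab,"c,d",e     (range starts here; first comma removed)
#   fg,h           (inside the range, although it has no quotes)
#   i"j,k"         (range ends here; first comma removed)
#   l,m,n          (outside the range; untouched)
```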
This type of task is nasty in standard sed. If it's just about one comma, then sed -E 's/("[0-9]*),([0-9]*")/\1 \2/' would do the trick, but for multiple commas you'd have to loop, giving ugly results like
sed -Ee :loop -e 's/("[0-9 ]*),([^"]*")/\1 \2/;tloop'
The ("[0-9 ]*) matches the opening double quote followed by any number of digits or spaces (spaces, because earlier passes of the loop have already turned commas into spaces) and will be referred to as \1 in the replacement; the ([^"]*") matches anything after the comma until the closing ", so \1 \2 is the same text, but with the first comma replaced by a space.
Now the t command branches to the loop label if a replacement was made. This gets repeated until there is no comma left to be replaced.
This even works for cases with more than one number with as many commas as you like: ,7/30/2019,"99,999,999,999,999",0,1 ,"10,000","foo, bar" will get transformed to ,7/30/2019,"99 999 999 999 999" 0 1 "10 000" "foo, bar"
Note that, as this output shows, a separator comma that directly follows a closing double quote is turned into a space as well when another double quote comes later on the line, because a closing quote also matches the "[0-9 ]* part of the pattern.
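The loop can also be tried out as a small variant that deletes the commas instead of replacing them with spaces (the question's comments ask for "10,000" to become "10000"). This is only a sketch under assumptions, not the command from this answer: the function name strip_quoted_commas is hypothetical, the pattern requires at least one digit right after the opening quote, and it only allows digits and commas up to the closing quote, so quoted text such as "foo, bar" is left alone.

```shell
#!/bin/sh
# Hypothetical variant of the sed loop: delete commas inside quoted numbers.
# Assumption: a quoted number consists only of digits and commas.
strip_quoted_commas() {
    # :loop   marks the label
    # s///    removes the first comma of a quoted number
    # tloop   jumps back to the label if the substitution changed anything
    sed -E -e ':loop' -e 's/("[0-9]+),([0-9,]*")/\1\2/' -e 'tloop'
}

printf '%s\n' ',7/30/2019,"99,999,999",0,1 ,"10,000","foo, bar"' |
    strip_quoted_commas
# prints: ,7/30/2019,"99999999",0,1 ,"10000","foo, bar"
```

Requiring a digit after the quote keeps the pattern from starting at a closing quote, so the commas between fields stay in place.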
edited 7 hours ago
answered 8 hours ago
Philippos
(1) Will there only be one pair of double quotes in one line? (2) Can there be higher numbers with multiple commas like 1,000,000? (3) How far did your own attempts lead?
– Philippos
10 hours ago
(1) Will there only be one pair of double quotes in one line? No, there can be many, but I am okay with replacing the comma between " " with a blank. (2) Can there be higher numbers with multiple commas like 1,000,000? Yes. (3) How far did your own attempts lead?
$ cat asdf
,7/30/2019,7/31/2019,Wed,8/1/2019,FH/FN 30yr & 20yr TBA & Spec ,"10,000",8/13/2019,
$ sed '/"/,/"/s/,//' asdf
7/30/2019,7/31/2019,Wed,8/1/2019,FH/FN 30yr & 20yr TBA & Spec ,"10,000",8/13/2019,
– Ramkumar
10 hours ago
Do you need to replace all numbers with a "," in them?
– Ned64
10 hours ago
I need to replace them with blank. "10,000" to "10000"
– Ramkumar
10 hours ago
Is there a limit on the numbers (e.g. can 12,000,000,000,000 occur?), how many "," max?
– Ned64
10 hours ago