Parsing a string of key-value pairs as a dictionary
$begingroup$
I often use nested list and dictionary comprehensions to parse unstructured data, and this is a typical example.
In [14]: data = """
41:n
43:n
44:n
46:n
47:n
49:n
50:n
51:n
52:n
53:n
54:n
55:cm
56:n
57:n
58:n"""
In [15]: {int(line.split(":")[0]):line.split(":")[1] for line in data.split("\n") if len(line.split(":"))==2}
Out[15]:
{41: 'n',
43: 'n',
44: 'n',
46: 'n',
47: 'n',
49: 'n',
50: 'n',
51: 'n',
52: 'n',
53: 'n',
54: 'n',
55: 'cm',
56: 'n',
57: 'n',
58: 'n'}
Here I am calling line.split(":") three times. Is there a better way to do this?
python python-3.x parsing dictionary
$endgroup$
asked 8 hours ago by Rahul Patel
edited 4 hours ago by 200_success
3 Answers
$begingroup$
You have too much logic in the dict comprehension:
{int(line.split(":")[0]):line.split(":")[1] for line in data.split("\n") if len(line.split(":"))==2}
First of all, let's expand it to a normal for-loop:
>>> result = {}
>>> for line in data.split("\n"):
...     if len(line.split(":"))==2:
...         result[int(line.split(":")[0])] = line.split(":")[1]
>>> result
I can see that you use the check if len(line.split(":"))==2: to eliminate the empty string at the start of data.split("\n"):
>>> data.split("\n")
['',
'41:n',
'43:n',
...
'58:n']
But the docs for str.split note that calling str.split() without a sep argument treats runs of whitespace as a single separator and returns no empty strings at the start or end of the result:
>>> data.split()
['41:n',
'43:n',
...
'58:n']
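To make the difference concrete, here is a minimal sketch comparing three common ways of breaking a block of text into lines. The sample string is a shortened, hypothetical stand-in for the data in the question.

```python
# Shortened stand-in for the question's data (an assumption for illustration).
data = """
41:n
55:cm
58:n"""

# An explicit separator keeps the empty string before the first newline.
by_newline = data.split("\n")

# No separator: runs of whitespace act as one separator, empty strings are dropped.
by_whitespace = data.split()

# splitlines() splits on line boundaries only, so the leading empty line is kept.
by_lines = data.splitlines()

print(by_newline)     # ['', '41:n', '55:cm', '58:n']
print(by_whitespace)  # ['41:n', '55:cm', '58:n']
print(by_lines)       # ['', '41:n', '55:cm', '58:n']
```

Because split() with no arguments also splits on spaces and tabs, it only substitutes for split("\n") when the values contain no whitespace.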
So now we can remove the unnecessary check from your code:
>>> result = {}
>>> for line in data.split():
...     result[int(line.split(":")[0])] = line.split(":")[1]
>>> result
Here you calculate line.split(":") twice. Take it out:
>>> result = {}
>>> for line in data.split():
...     key, value = line.split(":")
...     result[int(key)] = value
>>> result
This is the most basic version. Don't put it back into a dict comprehension, as it will still look quite complex. But you could make a function out of it, for example:
>>> def to_key_value(line, sep=':'):
...     key, value = line.split(sep)
...     return int(key), value
...
>>> dict(map(to_key_value, data.split()))
{41: 'n',
43: 'n',
...
58: 'n'}
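One thing the function above does not cover is a value that itself contains the separator, in which case the two-name unpacking raises a ValueError. The following is a hedged variant, not part of the original answer: it passes a maxsplit of 1 so that only the first separator is used. The sample lines are hypothetical.

```python
def to_key_value(line, sep=":"):
    """Split one 'key:value' line into an (int, str) pair.

    The maxsplit argument of 1 is an addition for illustration: further
    separators stay inside the value instead of breaking the unpacking.
    """
    key, value = line.split(sep, 1)
    return int(key), value


# Hypothetical sample lines; the second value contains the separator itself.
lines = ["41:n", "55:c:m", "58:n"]
result = dict(map(to_key_value, lines))
print(result)  # {41: 'n', 55: 'c:m', 58: 'n'}
```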
Another option that I came up with:
>>> from functools import partial
>>> lines = data.split()
>>> split_by_colon = partial(str.split, sep=':')
>>> key_value_pairs = map(split_by_colon, lines)
>>> {int(key): value for key, value in key_value_pairs}
{41: 'n',
43: 'n',
...
58: 'n'}
Also, if you don't want to keep a list of results from data.split in memory, you might find this helpful: Is there a generator version of string.split() in Python?
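As a rough illustration of that idea, the sketch below walks the text one line at a time with io.StringIO instead of building the full list of lines first. The helper name iter_pairs and the shortened sample data are assumptions made for this example, and StringIO still holds its own copy of the text, so the saving is only the intermediate list.

```python
import io


def iter_pairs(text, sep=":"):
    """Yield (int, str) pairs lazily, one per non-empty 'key:value' line."""
    # Iterating over a StringIO object yields one line at a time.
    for raw_line in io.StringIO(text):
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines such as the leading one
        key, value = line.split(sep, 1)
        yield int(key), value


# Shortened, hypothetical stand-in for the question's data.
data = "\n41:n\n55:cm\n58:n"
result = dict(iter_pairs(data))
print(result)  # {41: 'n', 55: 'cm', 58: 'n'}
```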
$endgroup$
$begingroup$
I said I want a solution using a list/dict comprehension. Your solution is nice but looks ugly. Thanks.
$endgroup$
– Rahul Patel, 58 mins ago
$begingroup$
There's nothing wrong with the solution you have come up with, but if you want an alternative, a regex might come in handy here:
In [10]: import re
In [11]: data = """
...: 41:n
...: 43:n
...: 44:n
...: 46:n
...: 47:n
...: 49:n
...: 50:n
...: 51:n
...: 52:n
...: 53:n
...: 54:n
...: 55:cm
...: 56:n
...: 57:n
...: 58:n"""
In [12]: dict(re.findall(r'(\d+):(.*)', data))
Out[12]:
{'41': 'n',
'43': 'n',
'44': 'n',
'46': 'n',
'47': 'n',
'49': 'n',
'50': 'n',
'51': 'n',
'52': 'n',
'53': 'n',
'54': 'n',
'55': 'cm',
'56': 'n',
'57': 'n',
'58': 'n'}
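Note that the keys above are strings, whereas the output in the question has int keys. A minimal sketch of closing that gap, assuming a precompiled pattern (the name PAIR is made up for this example) and a shortened stand-in for the data:

```python
import re

# Shortened, hypothetical stand-in for the question's data.
data = "\n41:n\n55:cm\n58:n"

# Compile once so the pattern can be reused across calls.
PAIR = re.compile(r"(\d+):(.*)")

# findall returns a list of (key, value) tuples, one per match.
result = {int(key): value for key, value in PAIR.findall(data)}
print(result)  # {41: 'n', 55: 'cm', 58: 'n'}
```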
Explanation:
1st capturing group (\d+):
- \d matches a digit (equal to [0-9])
- + is a quantifier: matches between one and unlimited times, as many times as possible, giving back as needed (greedy)
: matches the character : literally (case sensitive)
2nd capturing group (.*):
- . matches any character (except for line terminators)
- * is a quantifier: matches between zero and unlimited times, as many times as possible, giving back as needed (greedy)
If there might be letters in the first matching group (though I doubt it, since you're casting it to an int), you might want to use:
dict(re.findall(r'(.*):(.*)', data))
I usually prefer split() over regexes because I feel like I have more control over the functionality of the code.
You might ask, why would you want to use the more complicated and verbose syntax of regular expressions rather than the more intuitive and simple string methods? Sometimes, the advantage is that regular expressions offer far more flexibility.
Regarding @Rahul's comment about speed, I'd say it depends:
Although string manipulation will usually be somewhat faster, the actual performance depends heavily on a number of factors, including:
- How many times you parse the regex
- How cleverly you write your string code
- Whether the regex is precompiled
As the regex gets more complicated, it takes much more effort and complexity to write equivalent string manipulation code that performs well.
As far as I can tell, string operations will almost always beat regular expressions. But the more complex the task gets, the harder it is for string operations to keep up, not only in performance but also in maintainability.
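Rather than guessing, the two approaches can be timed directly. The sketch below is one way to do that with the standard timeit module; the function names, the generated sample data, and the run count are all assumptions for illustration, and the numbers will vary between machines and Python versions.

```python
import re
import timeit

# Generated stand-in data in the same 'key:value' shape as the question's.
data = "\n" + "\n".join(f"{i}:n" for i in range(41, 59))

# Precompiled so that compilation cost is not counted on every call.
PAIR = re.compile(r"(\d+):(.*)")


def with_split(text):
    """Parse using plain string methods."""
    result = {}
    for line in text.split():
        key, value = line.split(":")
        result[int(key)] = value
    return result


def with_regex(text):
    """Parse using the precompiled regular expression."""
    return {int(key): value for key, value in PAIR.findall(text)}


# Both approaches should agree before their timings are compared.
assert with_split(data) == with_regex(data)

for func in (with_split, with_regex):
    seconds = timeit.timeit(lambda: func(data), number=1_000)
    print(f"{func.__name__}: {seconds:.4f}s for 1,000 runs")
```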
$endgroup$
$begingroup$
Yeah. I think regexes are slow too.
$endgroup$
– Rahul Patel, 5 hours ago
$begingroup$
Note that it is much easier to read if you break the comprehension up over several lines instead of putting it all on one line.
You could use unpacking to remove some of the calls to line.split:
>>> {
...     int(k): v
...     for line in data.split()
...     for k, v in (line.split(':'),)
... }
{41: 'n', 43: 'n', 44: 'n', 46: 'n', 47: 'n', 49: 'n', 50: 'n', 51: 'n', 52: 'n', 53: 'n', 54: 'n', 55: 'cm', 56: 'n', 57: 'n', 58: 'n'}
Or, if the keys can be of type str, you could use dict().
This will unpack each line.split result and convert it into a key, value pair for you:
>>> dict(
...     line.split(':')
...     for line in data.split()
... )
{'41': 'n', '43': 'n', '44': 'n', '46': 'n', '47': 'n', '49': 'n', '50': 'n', '51': 'n', '52': 'n', '53': 'n', '54': 'n', '55': 'cm', '56': 'n', '57': 'n', '58': 'n'}
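Two other ways of splitting each line only once inside a comprehension are sketched below, as alternatives rather than as part of the answer above. The first feeds the comprehension from a generator expression; the second uses an assignment expression, which assumes Python 3.8 or later. The sample data is a shortened, hypothetical stand-in.

```python
# Shortened, hypothetical stand-in for the question's data.
data = "\n41:n\n55:cm\n58:n"

# Variant 1: a generator expression yields each split result exactly once.
pairs = (line.split(":") for line in data.split())
result_a = {int(key): value for key, value in pairs}

# Variant 2 (Python 3.8+): bind the split result with an assignment expression.
# The key expression is evaluated before the value, so parts is already bound.
result_b = {
    int((parts := line.split(":"))[0]): parts[1]
    for line in data.split()
}

print(result_a)  # {41: 'n', 55: 'cm', 58: 'n'}
print(result_b)  # {41: 'n', 55: 'cm', 58: 'n'}
```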
$endgroup$
$begingroup$
This is great. I was trying this but could not figure out the tuple thing. Thanks
$endgroup$
– Rahul Patel, 1 hour ago
First answer (for-loop and function version): answered 1 hour ago by Georgy
Second answer (regex version): answered 5 hours ago by яүυкяүυк, edited 5 hours ago
$begingroup$
Yeah. I think regexes are slow too.
$endgroup$
– Rahul Patel
5 hours ago
add a comment |
$begingroup$
Yeah. I think regexes are slow too.
$endgroup$
– Rahul Patel
5 hours ago
$begingroup$
Yeah. I think regexes are slow too.
$endgroup$
– Rahul Patel
5 hours ago
$begingroup$
Yeah. I think regexes are slow too.
$endgroup$
– Rahul Patel
5 hours ago
add a comment |
Note that a comprehension is much easier to read if you split it over several lines instead of putting it all on one line.
You could use unpacking to avoid repeated calls to line.split:
>>> {
...     int(k): v
...     for line in data.split()
...     for k, v in (line.split(':'),)
... }
{41: 'n', 43: 'n', 44: 'n', 46: 'n', 47: 'n', 49: 'n', 50: 'n', 51: 'n', 52: 'n', 53: 'n', 54: 'n', 55: 'cm', 56: 'n', 57: 'n', 58: 'n'}
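In case the one-element tuple looks odd: iterating over `(line.split(':'),)` yields exactly one item, so `k, v` get bound once per line from a single split call. A small sketch of the equivalence, with a shortened sample string that I made up for illustration:

```python
# Assumed sample input in the same "key:value" format.
data = "41:n 43:n 55:cm"

# Comprehension using the one-element tuple trick.
result = {
    int(k): v
    for line in data.split()
    for k, v in (line.split(':'),)
}

# The same logic written as a plain loop.
expanded = {}
for line in data.split():
    k, v = line.split(':')  # one split call, unpacked into two names
    expanded[int(k)] = v

print(result)  # {41: 'n', 43: 'n', 55: 'cm'}
```

Both versions build the same dictionary; the tuple only exists to give the comprehension somewhere to do the unpacking.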
Or, if the keys can stay of type str, you could use dict(). It unpacks the result of each line.split and converts it into a key-value pair for you:
>>> dict(
...     line.split(':')
...     for line in data.split()
... )
{'41': 'n', '43': 'n', '44': 'n', '46': 'n', '47': 'n', '49': 'n', '50': 'n', '51': 'n', '52': 'n', '53': 'n', '54': 'n', '55': 'cm', '56': 'n', '57': 'n', '58': 'n'}
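Both snippets raise a fairly cryptic ValueError when a token has no ':' or more than one. If that can happen in your input, a small helper along these lines might be easier to debug. The name `parse_pairs` and its `key_type` parameter are hypothetical, and using `str.partition` is my choice here, not something from the question:

```python
def parse_pairs(data, key_type=int):
    """Parse whitespace-separated 'key:value' tokens into a dict."""
    result = {}
    for token in data.split():
        # partition splits at the first ':' only; sep is '' if none is found.
        key, sep, value = token.partition(':')
        if not sep:
            raise ValueError(f"missing ':' in token {token!r}")
        result[key_type(key)] = value
    return result


print(parse_pairs("41:n 55:cm"))                # {41: 'n', 55: 'cm'}
print(parse_pairs("41:n 55:cm", key_type=str))  # {'41': 'n', '55': 'cm'}
```

Because only the first ':' is used as the separator, a value such as `a:b` would be kept intact instead of causing an unpacking error.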
edited 12 mins ago
answered 2 hours ago
Ludisposed
$begingroup$
This is great. I was trying this but could not figure out the tuple thing. Thanks.
$endgroup$
– Rahul Patel
1 hour ago