What is the most efficient way to compute the difference of lines from two files?
I have two lists in Python, `list_a` and `list_b`. `list_a` contains image links, and so does `list_b`. 99% of the items are the same, but I need to find the other 1%. All the surplus items are in `list_a`, which means every item in `list_b` is also in `list_a`. My initial idea is to subtract the lists: `list_a - list_b = list_c`, where `list_c` holds my surplus items. My code is:

    list_a = []
    list_b = []
    list_c = []

    arq_b = open('list_b.txt','r')
    for b in arq_b:
        list_b.append(b)

    arq_a = open('list_a.txt','r')
    for a in arq_a:
        if a not in arq_b:
            list_c.append(a)

    arq_c = open('list_c.txt','w')
    for c in list_c:
        arq_c.write(c)

I think the logic is right, and with only a few items the code runs fast. But I don't have 10 items, or 1,000, or even 100,000. I have 78,514,022 items in `list_b.txt` and 78,616,777 in `list_a.txt`. I don't know the cost of the expression `if a not in arq_b`, but if I run this code, I think it won't finish this year.

My PC has 8 GB of RAM, and I allocated 15 GB of swap so that I don't run out of memory.

My question is: is there a more efficient (faster) way to do this?

- `list_a` is sorted, but `list_b` is not.
- Each item has this size: `images/00000cd9fc6ae2fe9ec4bbdb2bf27318f2babc00.png`
- The order doesn't matter; I just want to know the surplus.
python python-3.x list performance
asked 2 days ago by Vinicius Morais, edited 2 days ago by Jean-François Fabre
Does the order matter? If not, try using sets. With sets, subtraction should be linear: `set_c = set_a - set_b`. – L3viathan, 2 days ago (5 upvotes)
But is it possible to do this in Python? – Vinicius Morais, 2 days ago
Will Python use the most efficient way to perform this operation? – Vinicius Morais, 2 days ago
Yes, I mean the Python datatype `set`. – L3viathan, 2 days ago (1 upvote)
@tripleee It's not a duplicate of that - that question is about mapping subtraction over a list, this question is about the difference between what's included in the lists. – SpoonMeiser, 2 days ago (1 upvote)
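To illustrate the cost difference raised in these comments, here is a minimal sketch that counts equality comparisons for a single membership test. `CountingStr` and `comparisons_for_lookup` are hypothetical helpers written for this illustration only. A list lookup compares against items one by one, so running it for each of roughly 78 million lines against a list of roughly 78 million lines would need on the order of 10^15 comparisons, while a set lookup typically needs a single comparison after hashing.

```python
class CountingStr(str):
    """A str that counts how often it is compared for equality."""
    comparisons = 0

    def __eq__(self, other):
        CountingStr.comparisons += 1
        return str.__eq__(self, other)

    # Overriding __eq__ disables hashing, so reuse the str hash explicitly.
    __hash__ = str.__hash__


def comparisons_for_lookup(container, target):
    """Return how many equality comparisons `target in container` performs."""
    CountingStr.comparisons = 0
    found = target in container
    assert found
    return CountingStr.comparisons


n = 1000
items = [CountingStr(f"images/{i:04d}.png") for i in range(n)]
target = CountingStr(items[-1])  # equal to the last item, but a distinct object

list_cost = comparisons_for_lookup(items, target)      # scans the whole list
set_cost = comparisons_for_lookup(set(items), target)  # hash probe, then compare
```

Here `list_cost` grows with the length of the list, whereas `set_cost` stays roughly constant no matter how many items the set holds.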
4 Answers
You can create a set from the contents of the first file, then use `difference` or `symmetric_difference`, depending on what you mean by a difference:

    with open("list_a.txt") as f:
        set_a = set(f)
    with open("list_b.txt") as f:
        diffs = set_a.difference(f)

If `list_b.txt` contains more items than `list_a.txt`, swap them or use `set_a.symmetric_difference(f)` instead, depending on what you need.

`difference(f)` works but still has to construct a new `set` internally. It is not a great performance gain (see "set issubset performance difference depending on the argument type"), but it's shorter.
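As a small illustration of the two methods, the sketch below uses in-memory stand-ins for the two files (the `io.StringIO` objects and their contents are made up for this example). It also strips the trailing newline from each line. That step is an addition, not part of the answer above: lines read from a file keep their `"\n"`, so a final line without a newline would otherwise not match an identical line that has one.

```python
import io

# Hypothetical in-memory stand-ins for list_a.txt and list_b.txt.
file_a = io.StringIO("images/1.png\nimages/2.png\nimages/3.png\n")
file_b = io.StringIO("images/2.png\nimages/4.png")  # no newline after the last line

# Normalise line endings so that equal items compare equal.
set_a = {line.rstrip("\n") for line in file_a}
items_b = [line.rstrip("\n") for line in file_b]

# Items that are in a but not in b.
only_in_a = set_a.difference(items_b)

# Items that are in exactly one of the two inputs.
in_exactly_one = set_a.symmetric_difference(items_b)
```

With these inputs, `only_in_a` holds `images/1.png` and `images/3.png`, and `in_exactly_one` additionally holds `images/4.png`.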
answered 2 days ago by Jean-François Fabre (edited 2 days ago)
Nice, this avoids having to allocate space for the second set. – L3viathan, 2 days ago
Well, not really, because internally a `set` is created, then thrown away. But it's thrown away faster. – Jean-François Fabre, 2 days ago (1 upvote)
But is the complexity the same as subtracting sets? – Vinicius Morais, 2 days ago
@ViniciusMorais The time complexity is the same, the space complexity (apparently), too. – L3viathan, 2 days ago
@L3viathan In case the original list (the original set) is not needed anymore, you can use `difference_update`. This should not require allocating a new set internally. – a_guest, 2 days ago (1 upvote)
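The `difference_update` suggestion from the last comment could look roughly like the sketch below. The function name `surplus_in_place` is hypothetical, and stripping the trailing newline is an assumption added here so that lines compare equal regardless of how the file ends. `difference_update` accepts any iterable, so the second input is consumed item by item and removed from the existing set.

```python
def surplus_in_place(lines_a, lines_b):
    """Return the items of lines_a that do not occur in lines_b.

    Builds one set from lines_a and removes the items of lines_b from it
    in place, so no second result set is created.
    """
    items = {line.rstrip("\n") for line in lines_a}
    # The generator is consumed lazily; lines_b is never held in memory as a whole.
    items.difference_update(line.rstrip("\n") for line in lines_b)
    return items
```

Both arguments can be open file objects, for example `surplus_in_place(fa, fb)` inside a `with open("list_a.txt") as fa, open("list_b.txt") as fb:` block.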
Try using sets:

    with open("list_a.txt") as f:
        set_a = set(f)
    with open("list_b.txt") as f:
        set_b = set(f)
    set_c = set_a - set_b
    with open("list_c.txt","w") as f:
        for c in set_c:
            f.write(c)

The complexity of subtracting two sets is O(n) in the size of `set_a`.
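A variant that stays closer to the loop in the question is sketched below. It assumes, as the question states, that only the lines of `list_a.txt` missing from `list_b.txt` are wanted. The function name `write_surplus` and the newline stripping are assumptions made for this example. Only the contents of the second file are held in a set, and the first file is streamed, which also keeps the output in the order of `list_a.txt`. Note that the loop in the question tests against `arq_b`, the already consumed file object, rather than `list_b`. The sketch tests against the set instead.

```python
def write_surplus(path_a, path_b, path_out):
    """Write the lines of path_a that are missing from path_b to path_out.

    Returns the number of lines written.
    """
    # Only the second file is loaded into memory, as a set for fast lookups.
    with open(path_b) as f:
        seen_b = {line.rstrip("\n") for line in f}

    written = 0
    # The first file is read line by line and never held in memory as a whole.
    with open(path_a) as fa, open(path_out, "w") as out:
        for line in fa:
            item = line.rstrip("\n")
            if item not in seen_b:
                out.write(item + "\n")
                written += 1
    return written
```

A call for the files in the question would be `write_surplus("list_a.txt", "list_b.txt", "list_c.txt")`. With about 78 million strings the set itself may still need more than the available 8 GB of RAM, so this is a sketch of the technique rather than a guarantee that it fits.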
answered 2 days ago by L3viathan (edited 2 days ago)
You know - an open file is an iterator - therefore you can simply do `set_a = set(open("list_a.txt"))`. – jsbueno, 2 days ago (2 upvotes)
Yes, but doing `set(f)` in a `with` block ensures that the file is closed. – Jean-François Fabre, 2 days ago (11 upvotes)
To extend the comment of @L3viathan: if the order of the elements is not important, a set is the right way.

Here is a dummy example you can adapt:

    l1 = [0,1,2,3,4,5]
    l2 = [3,4,5]
    setL1 = set(l1)  # transform the list into a set
    setL2 = set(l2)
    setDiff = setL1 - setL2  # make the difference
    listeDiff = list(setDiff)  # if you want to have your elements back in a list

As you can see, it is pretty straightforward in Python.
answered 2 days ago by RomainL.
In case order matters, you can presort the lists together with their item indices and then iterate over them together:

    list_2 = sorted(list_2)
    diff_idx = []
    j = 0
    for i, x in sorted(enumerate(list_1), key=lambda x: x[1]):
        # once list_2 is exhausted, every remaining item of list_1 is surplus
        if j >= len(list_2) or x != list_2[j]:
            diff_idx.append(i)
        else:
            j += 1
    diff = [list_1[i] for i in sorted(diff_idx)]

This has the time complexity of the sorting algorithm, i.e. O(n*log n).
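When both inputs are already sorted, the same idea can be applied as a streaming merge that keeps only one item of each input in memory at a time. The sketch below is not the code above but a related technique. The name `sorted_surplus` is hypothetical, and it assumes both inputs are sorted in the order used by Python's `<` on strings (in the question `list_a` is sorted, so `list_b.txt` would have to be sorted first, for example with an external sort).

```python
def sorted_surplus(items_a, items_b):
    """Yield the items of sorted items_a that have no match in sorted items_b.

    Each item of items_b cancels at most one equal item of items_a, so
    duplicates in items_a beyond those in items_b are reported as surplus.
    """
    it_b = iter(items_b)
    b = next(it_b, None)
    for a in items_a:
        # Skip items of b that sort before a; with b a subset of a this
        # never triggers, but it keeps the merge correct in general.
        while b is not None and b < a:
            b = next(it_b, None)
        if b is not None and a == b:
            b = next(it_b, None)  # matched, so consume this item of b
        else:
            yield a
```

For files, the arguments can be generators over open file objects, such as `(line.rstrip("\n") for line in fa)`, so that neither file is loaded as a whole.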
answered 2 days ago by a_guest (edited 2 days ago)