Deep copy and shallow copy ...
These are terms that often cause confusion.
In this article, I would like to look at how a program behaves differently under each copying method, using the smallest example in which the differences between them become apparent.
First, some disclaimers. If you are not interested in them, skip ahead to the part that describes what we will do.
In this article, the method variously called "pass-by-sharing", "call-by-sharing", "shared passing", and "passing a reference by value" is referred to by a single name, "pass-by-sharing".
Please note that I am not recommending this term in particular. I only settled on one name so that the specifications of several languages can be compared conveniently.
In this article, the terms "pass-by-reference" and "pass-by-sharing" are also used to describe how a variable is assigned.
These words are normally used for passing arguments to a function, but the same concept applies to assignment, so I use them as they are.
(This is my personal opinion, but I do not see much merit in restricting "pass-by-XX" to function arguments. I think it is fine to use it for assignment as well, so I do so in this article.)
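As a rough illustration of why the same words fit both situations, here is a minimal Python sketch. The function name `rebind_and_mutate` and the strings are made up for this example. It only shows that an argument received by a function and a variable assigned with `=` react in the same way to mutation and to rebinding.

```python
def rebind_and_mutate(param):
    # Mutating the object that param refers to is visible to the caller.
    param[0] = "mutated inside function"
    # Rebinding the local name only changes what param refers to.
    param = ["rebound inside function"]
    return param


# Case 1: passing an argument to a function
a = ["original"]
returned = rebind_and_mutate(a)

# Case 2: plain assignment, followed by the same two operations
c = ["original"]
d = c
d[0] = "mutated through d"
d = ["rebound d"]

print(a[0])  # mutated inside function
print(c[0])  # mutated through d
```

In both cases the mutation reaches the original list and the rebinding does not.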
When I search the web, I find many articles saying that "a shallow copy is a method that copies only a reference or pointer and does not create a new entity".
In this article, "shallow copy" means the case where a new entity is created, but something inside its contents still refers to the same entity as before the copy.
The English Wikipedia and the official Python documentation use the term in this sense as well.
(This is my personal opinion, but at the very least this case of "creating a new entity that still refers to parts of the copy source" should be called a "shallow copy", so I think the explanation "a shallow copy is something that does not create a new entity" is incorrect. There would be no contradiction if "something that does not create a new entity" were **also** called a shallow copy, but I think that is a mistake too.)
Assign a "copy in some sense" of the variable `a` to the variable `b`, modify `b`, and then check the contents of `a`.
Here we compare not just two kinds of copying but the following four:
- assignment by reference
- assignment by sharing
- assignment of a shallow copy
- assignment of a deep copy
The meaning of each term is explained below together with the code.
assignment.dart
void main() {
  var a = [
    ["did deep copy"]
  ];
  // Placeholder: here, a copy of a (in some sense) is assigned to b
  b[0][0] = "did shallow copy";
  b[0] = ["did pass-by-sharing"];
  b = [
    ["did pass-by-reference"]
  ];
  print(a[0][0]);
}
I only wanted to show what the processing does, without tying the discussion to a specific language, so I wrote it in Dart, which few people seem to use. (Though maybe the number has grown recently ...?)
The processing is as follows:
- Assign a copy of `a`, in some sense, to the variable `b`.
- Rewrite `b[0][0]`.
- Rewrite `b[0]`.
- Rewrite `b`.
- Print `a[0][0]`.
The output depends on how the copy is made. Let's see why this makes a difference.
If the assignment `b = a` is by reference, then `b` holds a "reference" that points to `a` itself, and all subsequent processing on `b` also affects `a`.
It is easier to follow if you think of `b` as an alias given to `a`.
In this case, the code above behaves as follows.
assignment.dart
void main() {
  var a = [
    ["did deep copy"]
  ];
  // Placeholder: here, a is assigned to b by reference
  // Everything below is equivalent to doing the same thing to a
  b[0][0] = "did shallow copy";
  b[0] = ["did pass-by-sharing"];
  b = [
    ["did pass-by-reference"]
  ];
  print(a[0][0]); // did pass-by-reference
}
Because of the final
b = [
  ["did pass-by-reference"]
];
the content of `a` also becomes a list of lists containing this string.
Therefore, the output is `did pass-by-reference`.
By the way, this string means "the assignment was done by reference".
Strictly speaking, pass-by-sharing is not only a term for how something is passed.
In a language that uses pass-by-sharing, the content of a variable is not the value itself but a reference to that value. (This is true of many languages, such as Java, Python, and JavaScript.)
So when `b = a` is executed, the "reference" stored in `a` is copied and stored in `b` as well.
This is why pass-by-sharing is sometimes called "passing a reference by value".
In this case, the above code behaves as follows.
assignment.dart
void main() {
  // a holds a reference to this nested list
  var a = [
    ["did deep copy"]
  ];
  // Placeholder: here, a is assigned to b by sharing
  // This step affects a because a and b hold references that point to the same entity.
  b[0][0] = "did shallow copy";
  // This step also affects a
  b[0] = ["did pass-by-sharing"];
  // Here, b stores a new reference pointing to a new entity, so this step does not affect a.
  b = [
    ["did pass-by-reference"]
  ];
  print(a[0][0]); // did pass-by-sharing
}
The final
b = [
  ["did pass-by-reference"]
];
assigns to `b` a reference that points to a different entity, `[["did pass-by-reference"]]`, so this step does not affect `a`.
Therefore, the output is `did pass-by-sharing`.
By the way, this string means "the assignment was done by sharing".
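The "copied reference" can be observed directly. The following is a minimal Python sketch, assuming Python's `is` operator as the way to check whether two names point to the same entity. The variable names `same_before` and `same_after` are made up for this illustration.

```python
a = [["did deep copy"]]
b = a  # the reference stored in a is copied into b

same_before = b is a  # True: both names point to one list
b[0] = ["did pass-by-sharing"]  # a mutation through b is visible through a

b = [["did pass-by-reference"]]  # b now holds a reference to a new list
same_after = b is a  # False: a still points to the old list

print(same_before, same_after, a[0][0])
# True False did pass-by-sharing
```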
We continue with languages in which the content of a variable is not the value itself but a reference to that value (Java, Python, JavaScript, and so on).
In languages where the content of a variable is the value itself (such as C++), there is no shallow copy in this sense. (I think it can be reproduced using pointers or references.)
A shallow copy, as the name implies, makes a copy.
If a piece of the content is a "reference", that reference is copied as it is.
Let's see what that means in code.
assignment.dart
void main() {
  // a holds a reference to this nested list
  var a = [
    ["did deep copy"]
  ];
  // Placeholder: here, a shallow copy of a is made and assigned to b
  // a and b hold references that point to different entities.
  // However, the 0th element of both entities is the same "reference", so
  // this step affects a
  b[0][0] = "did shallow copy";
  // The 0th element of b is replaced with a reference that points to something else.
  // This step does not affect a.
  b[0] = ["did pass-by-sharing"];
  // Here, b stores a new reference pointing to a new entity, so this step does not affect a either.
  b = [
    ["did pass-by-reference"]
  ];
  print(a[0][0]); // did shallow copy
}
`a` and its shallow copy are different entities. If you print `id` in Python or `hashCode` in Dart, you can confirm that `a` and `b` point to different entities.
However, the same contents have been copied.
More precisely, "references" pointing to the same things have been copied.
So if you rewrite the contents of what those references point to, both `a` and `b` are affected.
This is that step:
b[0][0] = "did shallow copy";
On the other hand, if you rewrite the contents of `b` itself, `a` is not affected.
That is this step:
b[0] = ["did pass-by-sharing"];
This step does not affect `a`. The resulting output is `did shallow copy`.
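The identity check mentioned above can be sketched in Python as follows. This is only an illustration: it uses `copy.copy` from the standard library for the shallow copy, and the names `outer_shared`, `inner_shared`, and `inner_shared_after` are made up here.

```python
import copy

a = [["did deep copy"]]
b = copy.copy(a)  # shallow copy: a new outer list with the same inner references

outer_shared = a is b  # False: the outer lists are different entities
inner_shared = a[0] is b[0]  # True: both 0th elements refer to one inner list

b[0][0] = "did shallow copy"  # rewrites the shared inner list, so a sees it
b[0] = ["did pass-by-sharing"]  # rewrites only b's own 0th slot
inner_shared_after = a[0] is b[0]  # False: the slots now refer to different lists

print(id(a) == id(b), a[0][0])
# False did shallow copy
```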
Like a shallow copy, a deep copy makes a copy, but it also examines the values that the contents refer to and copies them as well.
If such a value is itself a "reference", the value it refers to is examined and copied too.
This is repeated until everything reachable through references has been copied.
`a` and `b` no longer share anything.
Operations performed on one do not affect the other at all.
assignment.dart
void main() {
  // a holds a reference to this nested list
  var a = [
    ["did deep copy"]
  ];
  // Placeholder: here, a deep copy of a is assigned to b. From here on, operations on b do not affect a.
  b[0][0] = "did shallow copy";
  b[0] = ["did pass-by-sharing"];
  b = [
    ["did pass-by-reference"]
  ];
  print(a[0][0]); // did deep copy
}
Since nothing in `a` has changed, the initial value `did deep copy` is output.
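The "repeat until everything is copied" idea can be sketched as a small recursive function. This is a simplified illustration in Python, not how any particular library implements it. The helper name `deep_copy_lists` is hypothetical. It handles only nested lists and assumes there are no cycles. Python's standard `copy.deepcopy` covers far more cases.

```python
def deep_copy_lists(value):
    """Hypothetical helper: recursively copy nested lists (no cycle handling)."""
    if isinstance(value, list):
        # Build a new list and copy each element in turn.
        return [deep_copy_lists(item) for item in value]
    # Non-list values (for example, immutable strings) are returned unchanged.
    return value


a = [["did deep copy"]]
b = deep_copy_lists(a)

b[0][0] = "did shallow copy"  # touches only b's own inner list
print(a[0][0])  # did deep copy
```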
Now let's see how assignment actually behaves in several languages!
JavaScript
In JavaScript, ordinary assignment is pass-by-sharing.
assignment.js
const a = [["did deep copy"]];
let b = a;
b[0][0] = "did shallow copy";
b[0] = ["did pass-by-sharing"];
b = [["did pass-by-reference"]];
console.log(a[0][0]); // did pass-by-sharing
If you do not want to share the entity and it is an array, you can make a copy with `slice` and hold that instead.
assignment.js
const a = [["did deep copy"]];
let b = a.slice(0, a.length);
b[0][0] = "did shallow copy";
b[0] = ["did pass-by-sharing"];
b = [["did pass-by-reference"]];
console.log(a[0][0]); // did shallow copy
The copy in this case is a shallow copy. If you want a deep copy, you need some extra work.
An object, like an array, is shared when assigned normally. If you want to copy it, you can do the following.
assignment.js
const a = { x: { y: "did deep copy" } };
let b = Object.assign({}, a); // With b = a here, the object would be shared instead
b.x.y = "did shallow copy";
b.x = { y: "did pass-by-sharing" };
b = { x: { y: "did pass-by-reference" } };
console.log(a.x.y); // did shallow copy
Python
Next is Python.
assignment.py
import copy
a = [['did deep copy']]
b = a
b[0][0] = 'did shallow copy'
b[0] = ['did pass-by-sharing']
b = [['did pass-by-reference']]
print(a[0][0]) # did pass-by-sharing
In Python too, ordinary assignment is pass-by-sharing.
Python has a module called `copy` that lets you make shallow and deep copies explicitly.
See the documentation: copy --- Shallow and deep copy operations
You can make a shallow copy with `copy.copy`.
assignment.py
import copy
a = [['did deep copy']]
b = copy.copy(a)
b[0][0] = 'did shallow copy'
b[0] = ['did pass-by-sharing']
b = [['did pass-by-reference']]
print(a[0][0]) # did shallow copy
You can make a deep copy with `copy.deepcopy`.
assignment.py
import copy
a = [['did deep copy']]
b = copy.deepcopy(a)
b[0][0] = 'did shallow copy'
b[0] = ['did pass-by-sharing']
b = [['did pass-by-reference']]
print(a[0][0]) # did deep copy
Objects can also be copied with this module. It is convenient.
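As a small illustration with an object, here is a sketch using a made-up class `Box`. The class and its `items` attribute are hypothetical. Only `copy.copy` and `copy.deepcopy` come from the standard library.

```python
import copy


class Box:
    """Hypothetical class used only for this illustration."""

    def __init__(self, items):
        self.items = items


a = Box(["did deep copy"])
shallow = copy.copy(a)  # new Box, but its items attribute is the same list
deep = copy.deepcopy(a)  # new Box with its own copy of the list

shallow.items[0] = "did shallow copy"  # visible through a.items
deep.items[0] = "changed only in deep"  # not visible through a.items

print(a.items[0])  # did shallow copy
```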
Dart
assignment.dart
void main() {
  var a = [
    ["did deep copy"]
  ];
  var b = a;
  b[0][0] = "did shallow copy";
  b[0] = ["did pass-by-sharing"];
  b = [
    ["did pass-by-reference"]
  ];
  print(a[0][0]); // did pass-by-sharing
}
In Dart too, ordinary assignment is pass-by-sharing.
C++
C++ behaves very differently from the languages so far.
In C++, ordinary assignment does not share anything.
In C++, the content of a variable is not a "reference" but the value itself.
Since the value is copied as it is, the copy has no relationship with its source from that point on.
No matter how you modify the copy, the source is not affected.
For types with value semantics, such as the `std::vector` used below, it behaves exactly like a **deep copy**.
assignment.cpp
#include <iostream>
#include <string>
#include <vector>
using namespace std;
int main()
{
    vector<vector<string>> a{vector<string>{"did deep copy"}};
    vector<vector<string>> b = a; // copies the whole nested vector
    b[0][0] = "did shallow copy";
    b[0] = vector<string>{"did pass-by-sharing"};
    b = vector<vector<string>>{vector<string>{"did pass-by-reference"}};
    cout << a[0][0] << endl; // did deep copy
}
You can also assign by reference in C++.
With a reference, every operation performed on the new name affects the original.
assignment.cpp
#include <iostream>
#include <string>
#include <vector>
using namespace std;
int main()
{
    vector<vector<string>> a{vector<string>{"did deep copy"}};
    vector<vector<string>> &b = a; // b is another name for a
    b[0][0] = "did shallow copy";
    b[0] = vector<string>{"did pass-by-sharing"};
    b = vector<vector<string>>{vector<string>{"did pass-by-reference"}};
    cout << a[0][0] << endl; // did pass-by-reference
}
As we have seen, running this small sequence of operations makes it clear what actually gets assigned at the time of assignment.
Once you know that, you should be less likely to be troubled by unexpected behavior.
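If you want to try this against other ways of copying, the sequence can be wrapped in a small function. The following is a sketch in Python. The function name `probe` and the dictionary `results` are made up for this illustration. Python has no assignment by reference, so `did pass-by-reference` cannot appear here.

```python
import copy


def probe(make_b):
    """Run the article's sequence; make_b decides how b is obtained from a."""
    a = [["did deep copy"]]
    b = make_b(a)
    b[0][0] = "did shallow copy"
    b[0] = ["did pass-by-sharing"]
    b = [["did pass-by-reference"]]
    return a[0][0]


results = {
    "plain assignment": probe(lambda a: a),
    "copy.copy": probe(copy.copy),
    "slice a[:]": probe(lambda a: a[:]),
    "copy.deepcopy": probe(copy.deepcopy),
}

for name, label in results.items():
    print(f"{name}: {label}")
# plain assignment: did pass-by-sharing
# copy.copy: did shallow copy
# slice a[:]: did shallow copy
# copy.deepcopy: did deep copy
```

Passing any other copying function to `probe` shows which of the categories it falls into.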
There is also a related article that deepens understanding by comparing the behavior when passing arguments to a function with the behavior when assigning, so please read it if you like.
Understand passing by value / shared / by reference with the smallest example (5 lines)
If you find a mistake in this article, I would be very grateful if you could point it out! Thank you.