[JAVA] Minecraft 1.12.2 Back calculation of world seeds from world map

environment

background

What happens if the seed value is disclosed in Minecraft multiplayer? You can do the following with a simple idea.

--List the coordinates of all diamond ores near the quarry --Identify the coordinates and chest contents of all villages and Nether fortresses around the world --Identification of rare biomes --Identification of the shortest acquisition route for enchantment gold apples --Creative experiments using world clones --Terrain survey by using mods not authorized by the server --Search for a specific type of spawner --If MOD is included, search for rare structures derived from MOD --Search for rare biomes

Since it is not necessary to put MOD in the client, even if the client version, MOD, and non-modification are specified at the server rule level, no countermeasure can be taken.

Client MOD restrictions ・ If a specific player knows the seed on a server that does not disclose the seed, and that player uses the seed, even specifying the coordinates of the diamond ore will cause considerable balance destruction.

So I wondered if a multi-player non-saba can player could find a seed using only the information available by legitimate means without cracking the server. If this is possible, no matter how much the server administrator keeps the seed private, it can be physically known, so ** behavior monitoring and local rules must be taken separately **.

rule

Know the seed of the world when you have a Minecraft 1.12.2 Java Edition server that produces the same terrain as vanilla with some Forge mods.

conditions

Premise

--The world terrain generation config has not been tampered with from the default --Seed value is random in the long range --The calculation should take at most a few days ――It must be possible to calculate with a general personal computer

Ban

--You must only connect to the world by legitimate means --MODs that are not allowed by server rules cannot be used --Do not connect to the world using modified clients or mods --The server should only be accessed from Minecraft

Possible

--You can freely explore the surface of the Overworld --Coordinate display by F3 can be used --Biome display by F3 can be used ――You can freely record the information you see with screenshots etc.

strategy

The underlying strategy is to use "test for all possible seed candidates to see if a seed produces a particular terrain". However, as it is, 2 ^ 64 seeds can be considered, so even if the judgment part is considerably speeded up, it will take about 10,000 years on an ordinary personal computer. It seems that it will end soon if you use a supercomputer, but it is hard to say that it is possible to know it.

"Try all possible seed candidates to see if a seed produces a particular terrain" is written in pseudocode as follows:

for (long seed = 0; seed != 0xFFFFFFFFFFFFFFFFL; seed++) { // (1)
    if (f(seed)) System.out.println(seed);
}

/**
 *A function that returns whether or not a seed creates that world.
 */
abstract boolean f(long seed); // (2)

Here, the points to be improved are (1) and (2). At present, the number of loops for (1) is too large and it takes about 10,000 years, so it is necessary to use a loop method that can save more calculations. (2) must be narrowed down by the simplest possible formula.

(2) can also be multi-stage.

for (long seed = 0; seed != 0xFFFFFFFFFFFFFFFFL; seed++) {
    if (f1(seed) && f2(seed) && f3(seed)) System.out.println(seed);
}

/**
 *A fast function that roughly narrows down candidates
 */
abstract boolean f1(long seed);

/**
 *Medium-speed function that narrows down the candidates to a few
 */
abstract boolean f2(long seed);

/**
 *A slow function that narrows down the candidates to one
 */
abstract boolean f3(long seed);

This time, we will identify the seed with a strategy like this.

f1 High-speed function that roughly narrows down candidates

Since f1 is required to be fast anyway, I will make it a function that can be handled by GPGPU. In order to do so, some of the following conditions must be met.

--Do not use a running instance of Minecraft --Independent of JRE library

For example, x-> x + 700 == 4 is simple, so it's okay. x-> Math.sin (x * 0.001) == 1 needs to do something about Math.sin. x-> world.getBiome (new BlockPos (x, 65, 0)) == Biome.FOREST is not possible.

Aparapi is used for GPGPU.

--I tried GPU operation (Aparapi) in Java --Ka-ka_xyz's diary http://ka-ka-xyz.hatenablog.com/entry/20180129/1517152510

I used this for the GPU.

--A true low-end "Geforce GT 710" review. Benchmark game performance of ultra-power-saving graphics card that can be bought for 3,000 yen http://androgamer.net/2017/10/08/post-6852/

There is room for more savings with a better GPU.


Here, the generation position of the village is used for f1.

--Minecraft 1.12.2 Algorithm of the coordinate determination part of the village --Qiita https://qiita.com/MirrgieRiana/items/0a3baee86bb661dc5f99

Using the village-generated coordinates, you can roughly narrow down the candidates by the following procedure.

--Discover about 8 villages --Make a note of the chunk coordinates where the village is generated --List only the seeds for which villages are generated at those eight coordinates

Whether or not a world seed creates a village at a certain chunk coordinate is calculated by the following function test.

int distance = 32;

boolean test(long seed, int chunkX, int chunkZ)
{
	seed += (long) getRegionIndex(chunkX) * 341873128712L + (long) getRegionIndex(chunkZ) * 132897987541L + 10387312L;
	seed = seed ^ 0x5DEECE66DL;
	seed = next(seed);
	if (!test2(seed, chunkX)) return false;
	seed = next(seed);
	if (!test2(seed, chunkZ)) return false;
	return true;
}

boolean test2(long s, int chunkIndex)
{
	return (s >> 17) % 24 == getChunkIndexInRegion(chunkIndex);
}

int getRegionIndex(int chunkIndex)
{
	if (chunkIndex < 0) chunkIndex -= 31;
	return chunkIndex / 32;
}

int getChunkIndexInRegion(int chunkIndex)
{
	return chunkIndex - getRegionIndex(chunkIndex) * 32;
}

long next(long s)
{
	return (s * 0x5DEECE66DL + 0xBL) & 0xffffffffffffL;
}

long prev(long s)
{
	return ((s - 0xBL) * 0xdfe05bcb1365L) & 0xffffffffffffL;
}

This code extends inside Random. See below for Random internal processing.

--Back calculation of the transition of internal seed of Random --Qiita https://qiita.com/MirrgieRiana/items/1833ec6ab9d90341760a#%E5%95%8F%E9%A1%8C

This function is available from Aparapi as it is independent of JRE or Minecraft instance.

In the following, we will assume that the distance is fixed at 32, which is the default. Then, there are 24 different coordinates for generating villages in the area.


By calling the above function for the number of villages investigated, the seeds can be roughly narrowed down.

However, due to the truncation of the upper 16 bits of Random, this function cannot, in principle, narrow down the number of seed candidates to one, no matter how much the village is investigated. So, I will narrow down to about 1000 out of 2 ^ 48 ways as follows, and then narrow down the candidates of about 65536 * 1000 again.

for (long seed = 0; seed != 0xFFFFFFFFFFFFL; seed++) {
    if (!test(seed, -16, -18)) continue; //Village placement of seed 1251341
    if (!test(seed, 12, -49)) continue;
    if (!test(seed, -77, 19)) continue;
    if (!test(seed, 21, -80)) continue;
    System.out.println(seed);
}

If 2 ^ 64 ways take 10,000 years, a simple calculation of 2 ^ 48 ≒ 2.8E + 14 ways will take about 50 days. Subsequent narrowing down from 65536000 ≒ 6.5E + 7 ways is far less than that.

(1) Optimization of loop part

We were able to reduce the number of calculations to about 50 days, but it is still hard to say that it is possible to calculate. However, there is actually a simple way to reduce this to 1/24. To do this, use the village generation algorithm and Random specifications listed above. The generated coordinates of a village are 1 / (24 * 24) narrowing performance for one village, and 1/24 for only one side of the coordinates. The code below is a compilation of test and test2. Here, the seed at the time of (3) can be predicted to some extent.

boolean test(long seed, int chunkX, int chunkZ)
{
	seed += (long) getRegionIndex(chunkX) * 341873128712L + (long) getRegionIndex(chunkZ) * 132897987541L + 10387312L;
	seed = seed ^ 0x5DEECE66DL;
	seed = next(seed);
	// (3)
	if (!((seed >> 17) % 24 == getChunkIndexInRegion(chunkX))) return false; // (4)
	seed = next(seed);
	if (!((seed >> 17) % 24 == getChunkIndexInRegion(chunkZ))) return false;
	return true;
}

If the getChunkIndexInRegion (chunkX) part was found to be 20, the judgment formula in (4) would be (seed >> 17)% 24 == 20. Shift right 17 and loop for a number whose value divided by 24 is 20.

The bit structure of the seed is as follows.

-------- -------- 00000000 00000000 00000000 0000000? ???????? ????????

Here, - is the part that is ignored by Random's specifications (it does not affect the placement of the village), and ? Is the part that is discarded by the right 17 shift (it affects the placement of the village, but the chunk X of the first village). Does not affect). To fix the chunk X of the first village, the remainder of the remaining 0 divided by 24 should be a specific value. Also, the ? Part can be anything, so loop inside so that there are 17 bits of freedom. The coded version is as follows.

int chunkX1 = 20;
int chunkZ1 = 20;

for (long j = getChunkIndexInRegion(chunkX1); j <= 0x7FFFFFFFL; j += 24) {
    for (long i = 0; i <= 0x1FFL; i++) {
        long s = (j << 17) | i; // (5)
        
    }
}

** Addendum: for (long i = 0; i <= 0x1FFL; i ++) { is a mystery. There is a theory that for (long i = 0; i <= 0x1FFFFL; i ++) { is correct. ** **

chunkX1 and chunkZ1 are the chunk coordinates of the first village in the village coordinate list.

However, the s in (5) represents the seed at the time of (3), so you have to calculate a little to get the seed of the world itself. To change s to seed:

long seed = s;
seed = prev(seed);
seed = seed ^ 0x5DEECE66DL;
seed -= (long) getRegionIndex(chunkX1) * 341873128712L + (long) getRegionIndex(chunkZ1) * 132897987541L + 10387312L;
seed = seed & 0xFFFFFFFFFFFFL;

Now we can loop the seed of the world where the village is generated at some X coordinate in a region. This brings the number of loops to 2 ^ 48/24. The calculation takes about two days, so it can be said that it is realistic enough. In the experiment, the coordinates of eight villages were investigated, the calculation was completed in about 27 hours using a (poor performance) GPGPU, and the lower 48-bit patterns could be narrowed down to about 250 candidates.

f2 Medium-speed function that narrows down the candidates to a few

Previous knowledge

In this function, f1 specifies up to one candidate narrowed down to about 1000 in the lower 48 bits.

In the experiment, 8 villages were surveyed, so there should be 24 ^ (2 * 8) narrowing performance (73 bits), but in reality, about 250 candidates were born. Therefore, it is doubtful that f1 can identify up to one. It is considered that many low-order 48-bit patterns correspond to one arrangement.

f2 is performed to identify the pattern of the lower 48 bits to one. f2 seems to have that much ability. If you spend hundreds of times more time with f3, you don't need f2, but f3 is quite heavy, so I want to save it.

strategy

We will use villages here as well, but this time we will focus on structure rather than placement.

--Minecraft 1.12.2 Algorithm of the coordinate determination part of the village --Qiita https://qiita.com/MirrgieRiana/items/0a3baee86bb661dc5f99 --Minecraft 1.12.2 Village Building List --Qiita https://qiita.com/MirrgieRiana/items/b5d56b4d78c503ddb7a0

The village is composed of parts other than buildings such as wells, roads and streets, and building parts such as houses, fields and churches. Here, we pay attention to the number of buildings that are easy to understand visually.

For example, the following village is composed of the following buildings.

2018-09-22_22.03.33.png

--House1 3 bookshelf houses --House2 2 blacksmiths --House3 1 dance hall --House4Garden Cube 6 pieces --WoodHut 5 toilets --Hall 3 garden houses --Church 2 churches --Field1 2 double fields --Field2, 4 single fields

This is shown by a hash value of 0x422356123L, arranged from the bottom, for example. Expressing the structure of the village as a hash value based on the number of buildings, the problem is simple because == can be used to judge whether the shapes are almost the same. In addition, it is easy to collect information because it can be easily calculated from the information that can be visually confirmed by expressing the number of buildings in the village. Even with this, it has the ability to narrow down the pattern of the lower 48 bits of the seed from 1000 to 1 in a survey of 1 or 2 villages.

Since the generated chunk coordinates of this village are -16, -18, this is called "* Seed 1251341 generates a village of 0x422356123L at chunk coordinates -16, -18 *".


The random numbers that determine the structure of the village are generated by MapGenBase # generate, slightly modified by MapGenStructure # recursiveGenerate, and used in MapGenVillage.Start # new. If this is made into a code, it will look like this.

World worldIn =World instance given from somewhere;
int size = 1;

Random rand = new Random(getSeedInChunk(1251341, -16, -18));
rand.nextInt();
new MapGenVillage.Start(worldIn, rand, -16, -18, size);

/**
 *A function that returns a seed for a village structure given a world seed and village chunk coordinates
 */
long getSeedInChunk(long seed, long chunkX, long chunkZ)
{
    Random rand = new Random(seed);
    long j = rand.nextLong();
    long k = rand.nextLong();
    (chunkX * j) ^ (chunkZ * k) ^ seed;
}

MapGenVillage.Start # new also takes World, base chunk coordinates and village size. The village size is 1 by default.

The structure of the village depends on the seed of the world and the chunk coordinates generated. It looks like there is no pattern loop unlike the Nether Fortress.

--Minecraft 1.12.2 Algorithm for determining spawn coordinates for Nether Fortress --Qiita https://qiita.com/MirrgieRiana/items/e01a3db6fd7bf183fbbb

The hash value of the village structure is calculated by simulating the inside of MapGenVillage.Start # new, but this constructor is roughly as follows.

List<structurecomponent> components = new ArrayList<>();

MapGenVillage.Start(World worldIn, Random rand, int x, int z, int size)
{
    List<PieceWeight> list =Method to get a list of building types, weights and upper limit pairs();
    StructureVillagePieces.Start start = new StructureVillagePieces.Start(
        worldIn.getBiomeProvider(), 0, rand, (x << 4) + 2, (z << 4) + 2, list, size);
    components.add(start); // (7)
    start.buildComponent(start, components, rand); // (8)
    List<StructureComponent> list1 = start.pendingRoads;
    List<StructureComponent> list2 = start.pendingHouses;

    while (!list1.isEmpty() || !list2.isEmpty()) { // (9)
        if (list1.isEmpty()) {
            int i = rand.nextInt(list2.size());
            StructureComponent structurecomponent = list2.remove(i);
            structurecomponent.buildComponent(start, components, rand);
        } else {
            int j = rand.nextInt(list1.size());
            StructureComponent structurecomponent2 = list1.remove(j);
            structurecomponent2.buildComponent(start, components, rand);
        }
    }

    // (6)
}

At the time of (6), components contains a list of parts that make up the village. StructureVillagePieces.Start # new seems to represent the first well, which was added to components in (7). In (8), it seems that buildComponent is generating four paths derived from the well. Then, while deriving the structure generated in (9) one after another, it plunges into components. It will be completed when it finally comes to (6). Actually, the processing continues after this, but it is not used for the calculation of the hash value, so ignore it.

Around (6), if you search the contents of components, count the buildings and write a code to create a hash value and return, you will have a function that returns the structure of the village generated there from Random and chunk coordinates. .. However, this function cannot be placed in main because it depends on World that requires Minecraft initialization processing. This function can be called from code in Minecraft available to World. For example, you can write code that calls this function by overwriting the right-click process of the arrow.

From the discussion so far, we have created a "function that returns the hash value of the village structure given the seed and chunk coordinates". Rewriting this to the condition "Does a seed generate a village with a hash value at a certain chunk coordinate?" F2 is complete. As a result, the seed reaches the point where the upper 16 bits are unknown and the lower 48 bits are confirmed. This can be done by surveying at most one or two villages. The calculation time is a few seconds in a single thread.

f3 Slow function that narrows down the candidates to one

Here, the seed candidates are narrowed down from 65536 patterns for the upper 16 bits to one pattern. Use the biome for that. The conditional expression is "Does the seed generate a specific biome at a specific XZ coordinate?"

The village ignores the top 16 bits, but the biome seems to see it. I don't know the biome determination algorithm, but it is enough to know the biome of the coordinates with getBiome provided by BiomeProvider derived from WorldInfo obtained from World.

The problem here is how to pass the seed to BiomeProvider, by rewriting the randomSeed field that WorldInfo passed to the constructor of BiomeProvider is privately stored. However, since randomSeed is private, it needs to be rewritten to public.

This process also requires World. When I initialized Minecraft by myself and generated World without starting Minecraft correctly, it seemed that a slightly different terrain was generated and it was not successful. So, this function also modifies the right-click processing of the clock item or something so that it is called in it.

The list of coordinates and biomes to pass through this process must be carefully crafted. Since the survey of the terrain is human-powered, I am still worried, so I would like to take the coordinates in the middle of the biome if possible. Instead of trying to match all 10 if 10 more are passed, I would like to proceed with the work while performing sequential confirmation such as listing and sorting 8 or more matches.

In this process, we succeeded in identifying from 65536 patterns to 1 pattern by walking 10 biomes, writing down the coordinates near the center, and pouring them into the function. It takes a few minutes to specify.


Throughout, we were able to identify a single seed in the 64-bit range using the coordinates of eight villages, the structural hash value of one village, and the ten biomes. This took about 27 hours in my environment.

Summary

** Minecraft 1.12.2 Java Edition vanilla's default terrain generation world can gather enough information to back-calculate seeds just by walking around the surface. You can then use that information to back-calculate the seed in at most two days. ** Furthermore, if the world map and the position of the origin are given without actually logging in, the information will be gathered by itself.

This time, it took 2 days to calculate because it was assumed that the seed was in the long range (2 ^ 64), but if the seed was set to int, it would take about 4 seconds to search if the coordinates of 4 villages were known. Is finished. ʻObject # hashCode () `is the return value of int, so if you give a seed from a string, it will be like this.

It is effective to change the terrain generation parameters a little for this attack. However, since the seed of the world affects various places, it will still be possible to calculate backwards using the parts other than those used this time. The decisive factor this time is the fact that the upper 16 bits of the seed value given by java.util.Random are truncated. The moment you put the seed value in Random, there are only 2 ^ 48 patterns in the world. At this point, if you already have 50 days, you can do a full search, so if you calculate with multiple computers, it will be easy.

At present, ** in a world that already meets the conditions for back calculation, no matter how much attention is paid to server security, the seed is theoretically sloppy **.

Recommended Posts

Minecraft 1.12.2 Back calculation of world seeds from world map
Back calculation of the transition of the internal seed of Random