Hello! This time, I will show how to get the internal representation of floating-point numbers.
Floating-point numbers represent real numbers using a sign, an exponent, and a mantissa. Many programming languages adopt the IEEE 754 standard, in which the internal representation is laid out, starting from the most significant bit, as the sign part, the exponent part, and the mantissa part.
Here, we will write some programs that get the internal representation of the single-precision floating-point type (float). The languages are C, C++, C#, VB.NET, and Java.
In C and C++, we use a **union** consisting of one **float** member and one **int** member, which has the same data size as float. Because the members of a union share the same memory area, storing a real number in the float member lets you read the internal representation of that value from the int member. (Strictly speaking, the C++ standard does not define reading a union member other than the one last written, although major compilers support it.)
float-bin.c
#include <stdio.h>

int main( void ) {
    /* The float type is 32 bits wide, so it is paired with a 32-bit integer type (int). */
    union { float f; int i; } a;
    int i;
    a.f = -2.5f;
    printf( "%f ( %08X )\n", a.f, a.i );
    /* Show the bit string */
    for( i = 31; i >= 0; i-- ){
        printf( "%d", ( a.i >> i ) & 1 );
    }
    printf( "\n" );
    /* Extract the sign part (1 bit), exponent part (8 bits), and mantissa part (23 bits). */
    printf( "Sign part: %X\n", ( a.i >> 31 ) & 1 );
    printf( "Exponent part: %X\n", ( a.i >> 23 ) & 0xFF );
    printf( "Mantissa: %X\n", a.i & 0x7FFFFF );
    return 0;
}

**Execution result**
-2.500000 ( C0200000 )
11000000001000000000000000000000
Sign part: 1
Exponent part: 80
Mantissa: 200000
C# and VB.NET use the **BitConverter** class of the .NET Framework. The **BitConverter.GetBytes** method first converts a float value to a byte array, and the **BitConverter.ToInt32** method then reads that array back as an int, which gives the internal representation of the float value.
float-bin.cs
using System;

public class Program{
    public static void Main(){
        float f = -2.5f;
        // The BitConverter.ToInt32 method takes a byte array and the index of the first byte to convert.
        int i = BitConverter.ToInt32( BitConverter.GetBytes( f ), 0 );
        Console.WriteLine( "{0:F} ( {1:X8} )", f, i );
        // Show the bit string
        for( int j = 31; j >= 0; j-- ){
            Console.Write( ( i >> j ) & 1 );
        }
        Console.WriteLine();
        // Extract the sign part (1 bit), exponent part (8 bits), and mantissa part (23 bits).
        Console.WriteLine( "Sign part: {0:X}", ( i >> 31 ) & 1 );
        Console.WriteLine( "Exponent part: {0:X}", ( i >> 23 ) & 0xFF );
        Console.WriteLine( "Mantissa: {0:X}", i & 0x7FFFFF );
    }
}
float-bin.vb
Imports System

Module Module1
    Sub Main()
        Dim f As Single = -2.5f
        ' The BitConverter.ToInt32 method takes a byte array and the index of the first byte to convert.
        Dim i As Integer = BitConverter.ToInt32( BitConverter.GetBytes( f ), 0 )
        Console.WriteLine( "{0:F} ( {1:X8} )", f, i )
        ' Show the bit string
        Dim j As Integer
        For j = 31 To 0 Step -1
            Console.Write( ( i >> j ) And 1 )
        Next
        Console.WriteLine()
        ' Extract the sign part (1 bit), exponent part (8 bits), and mantissa part (23 bits).
        Console.WriteLine( "Sign part: {0:X}", ( i >> 31 ) And 1 )
        Console.WriteLine( "Exponent part: {0:X}", ( i >> 23 ) And &HFF )
        Console.WriteLine( "Mantissa: {0:X}", i And &H7FFFFF )
    End Sub
End Module
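In Java, the **Float** class is used. The **Float.floatToIntBits** method returns the internal representation of a float value. (It collapses every NaN into a single canonical bit pattern; Float.floatToRawIntBits keeps the original NaN bits.)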
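float-bin.java (when compiling with javac, save the file as Main.java so that the file name matches the public class Main)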
public class Main {
    public static void main(String[] args) throws Exception {
        float f = -2.5f;
        int i = Float.floatToIntBits( f );
        System.out.printf( "%f ( %08X )\n", f, i );
        // Show the bit string
        for( int j = 31; j >= 0; j-- ){
            System.out.printf( "%d", ( i >> j ) & 1 );
        }
        System.out.println();
        // Extract the sign part (1 bit), exponent part (8 bits), and mantissa part (23 bits).
        System.out.printf( "Sign part: %X\n", ( i >> 31 ) & 1 );
        System.out.printf( "Exponent part: %X\n", ( i >> 23 ) & 0xFF );
        System.out.printf( "Mantissa: %X\n", i & 0x7FFFFF );
    }
}
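In C, C++, and C#, you can also get the internal representation of a float value by casting **a float pointer to an int pointer** and dereferencing it. Note that in C and C++ this access breaks the strict aliasing rule, so the language standards do not guarantee its behavior, although it often works in practice.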
float-binptr.c
#include <stdio.h>

int main( void ) {
    float f = -2.5f;
    /* Reinterpret the address of f as a pointer to int and read through it. */
    int i = *( ( int* )&f ), j;
    printf( "%f ( %08X )\n", f, i );
    /* Show the bit string */
    for( j = 31; j >= 0; j-- ){
        printf( "%d", ( i >> j ) & 1 );
    }
    printf( "\n" );
    /* Extract the sign part (1 bit), exponent part (8 bits), and mantissa part (23 bits). */
    printf( "Sign part: %X\n", ( i >> 31 ) & 1 );
    printf( "Exponent part: %X\n", ( i >> 23 ) & 0xFF );
    printf( "Mantissa: %X\n", i & 0x7FFFFF );
    return 0;
}
In C#, pointers are treated as unsafe code, so the program must be compiled with the **/unsafe** option.
float-binptr.cs
using System;

public class Program {
    // Code that uses pointers must be marked with the "unsafe" keyword.
    public unsafe static void Main() {
        float f = -2.5f;
        int i = *( ( int* )&f );
        Console.WriteLine( "{0:F} ( {1:X8} )", f, i );
        // Show the bit string
        for( int j = 31; j >= 0; j-- ) {
            Console.Write( ( i >> j ) & 1 );
        }
        Console.WriteLine();
        // Extract the sign part (1 bit), exponent part (8 bits), and mantissa part (23 bits).
        Console.WriteLine( "Sign part: {0:X}", ( i >> 31 ) & 1 );
        Console.WriteLine( "Exponent part: {0:X}", ( i >> 23 ) & 0xFF );
        Console.WriteLine( "Mantissa: {0:X}", i & 0x7FFFFF );
    }
}
This time, I wrote programs that get the internal representation of the float type, using several languages as examples. Of course, you can get the internal representation of the double type in the same way.
These programs may come in handy when, for example, floating-point arithmetic comes up in a computer class.
See you next time!