This article is a deeper dive into an earlier article. I will summarize once more what a library is and what its public API is.
A program starts at the main function and does its work through the functions that main calls and the data they use. Because it cannot run without those functions and data, each of them must have a definition (an entity) somewhere in the program.
When you write programs this way, sooner or later you want to cut corners, because you keep needing exactly the same functions every time. A **library** is a mechanism for bundling such **exactly identical functions** together.
When the program is compiled, it is associated with the library. This association is called **library linking**, or simply **linking**.
In order to be linked, the library must do the following:
- **Make the functions and data that other programs need available from the library**
In the figure, that means disclosing information about `apllo()` and the cups that are used. This is the main theme of this article: **information that is open to the public**. Once the function information has been resolved by linking, the program can use the library. How the library is used then depends on the type of library.
A static library is a library in a format that the program takes in as-is at compile time. Because it is included in the program at build time, the functions already exist when the program runs, so **you can use them without thinking about anything**. On the other hand, because the library code is copied into the program, **the size increases**.
With a shared library, the program records only the library information at compile time and is associated with the actual library file at run time. (This is also called **linking the library**.)
Because the program holds only the library information, you have to **make the target library linkable** when the program runs.
This mechanism is described [here](https://qiita.com/developer-kikikaikai/items/f6f87b2d1d7c3e14fb52#%E5%9F%BA%E6%9C%AC%E7%9A%84%E3%81%AAlinux%E3%81%AE%E5%8B%95%E7%9A%84%E3%83%A9%E3%82%A4%E3%83%96%E3%83%A9%E3%83%AA%E3%83%AA%E3%83%B3%E3%82%AF%E3%81%AB%E9%96%A2%E3%81%99%E3%82%8B%E4%BB%95%E7%B5%84%E3%81%BF).
Simply put, everything is fine if the `ldd` command prints a path for the target library. If it does not, you need to pass the library's path either at build time (for example as an rpath) or through an environment variable (for example `LD_LIBRARY_PATH`). As shown below, it is OK as long as the right-hand side of `=>` resolves to some path.
$ ldd /usr/bin/curl
linux-vdso.so.1 (0x00007ffe307ac000)
libcurl.so.4 => /usr/lib/x86_64-linux-gnu/libcurl.so.4 (0x00007f85cce4e000)
...
libssl.so.1.1 => /usr/lib/x86_64-linux-gnu/libssl.so.1.1 (0x00007f85cbb4b000)
libcrypto.so.1.1 => /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1 (0x00007f85cb6d3000)
...
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f85c6c95000)
2018/05/16 postscript
~~I had been using the names incorrectly in this article, so~~ I would like to take this opportunity to summarize the official names of the library types.
Name | Example | Overview |
---|---|---|
Static library | Linux xx.a, Windows xxx.lib | Library in the format incorporated into the program |
Shared library | Linux xx.so, Windows xxx.dll | Library linked at program startup |
Dynamic library | Linux dlopen(), Windows LoadLibrary() | Library to link while the program is running |
An `xx.a` file somehow feels like a dynamic library, because it gets incorporated fluidly into all kinds of programs. I often get it wrong unless I sort it out in my head: "Dynamic... oh right, the name comes from the different way it is taken in." ...Maybe that is just me.
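The third row of the table, linking while the program is running, can be sketched in C roughly as follows. This is only a minimal sketch: the helper name `call_cos_dynamically` is hypothetical, and it assumes a glibc system where the math library is installed as `libm.so.6` (the same soname that appears in the `ldd` output above).

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: loads libm at run time and calls cos() through it.
 * Returns 0 on success and stores the result in *out; returns -1 on failure. */
int call_cos_dynamically(double x, double *out)
{
    /* Assumption: the math library is available under this soname. */
    void *handle = dlopen("libm.so.6", RTLD_LAZY);
    if (handle == NULL) {
        return -1;
    }

    /* Look the function up by name. This only works for symbols
     * that the library actually publishes. */
    double (*cos_fn)(double);
    *(void **)(&cos_fn) = dlsym(handle, "cos");
    if (cos_fn == NULL) {
        dlclose(handle);
        return -1;
    }

    *out = cos_fn(x);
    dlclose(handle);
    return 0;
}
```

On older glibc versions you may need to add `-ldl` when linking. Note that the program itself never mentions `cos` at build time, so `nm -D` on such a program would not list it as an unresolved symbol.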
That was a long preamble. The main part starts here. I have been talking about it loosely, but what exactly is the information that is disclosed to the outside? In programming terms, it is anything **with external linkage**. (In this article I deliberately call it **public information**.)
To put it very roughly for C, **it is basically the functions and variables that are not declared `static`**. Note that the way of thinking differs slightly between C and C++. I am told that in C++, **if the class is not static, a static method may not be able to handle the class's internal variables properly**. In C++ you also **need to distinguish the `static` that makes something non-public from the `static` that does not**. For more information on C++, please refer to the references. This article focuses on C.
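As a rough illustration of the C rule, here is a minimal sketch of one source file that mixes internal and external linkage. All names (`counter_add`, `counter_limit`, and so on) are hypothetical and exist only for this example.

```c
/* counter.c: a sketch of internal vs. external linkage in C. */

/* static: internal linkage, visible only inside this file. */
static int total = 0;

/* No static: external linkage, so other files can refer to this variable. */
int counter_limit = 100;

/* static: a helper that cannot be called from other files. */
static int clamp_to_limit(int value)
{
    return value > counter_limit ? counter_limit : value;
}

/* No static: external linkage, so this becomes public information. */
int counter_add(int amount)
{
    total = clamp_to_limit(total + amount);
    return total;
}
```

If this file were built into a shared library without any further restriction, I would expect `counter_add` and `counter_limit` to be visible from outside, while `total` and `clamp_to_limit` stay local to the file.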
The problem with this condition alone is that **every function used across files inside the library becomes public information!** So from here on I will show how to check public information and how to limit it. Part of this overlaps with the earlier article.
You can check public information with the `nm` command.
For example:
First, check the program itself.
$ nm -D test
w __cxa_finalize
w __gmon_start__
w _ITM_deregisterTMCloneTable
w _ITM_registerTMCloneTable
U __libc_start_main
U memcmp
U __printf_chk
U publisher_free
U publisher_new
U publisher_publish
U publisher_subscribe
U publisher_unsubscribe
U puts
U __stack_chk_fail
Lowercase letters generally mark local symbols (the `w` entries here are weak symbols); I will skip the details. What matters are the capital letters. **U means unresolved**, that is, a function that has to be linked at run time from a shared library. The publisher_XXX functions are my own functions, so the program cannot use them unless the library publishes them.
The library being linked, on the other hand, looks like this:
$ nm .libs/libpublisher.so
...
0000000000000f90 t dputil_list_pop
0000000000000f50 t dputil_list_pull
0000000000000f20 t dputil_list_push
0000000000000ee0 t dputil_lock
0000000000000f00 t dputil_unlock
...
U pthread_mutex_init@@GLIBC_2.2.5
U pthread_mutex_lock@@GLIBC_2.2.5
U pthread_mutex_unlock@@GLIBC_2.2.5
U __pthread_register_cancel@@GLIBC_2.3.3
U __pthread_unregister_cancel@@GLIBC_2.3.3
w __pthread_unwind_next@@GLIBC_2.3.3
...
00000000000009b0 T publisher_free
0000000000202090 b publisher_g
0000000000000a00 T publisher_new
0000000000000b20 T publisher_publish
0000000000000a90 T publisher_subscribe
0000000000000ae0 T publisher_unsubscribe
0000000000000910 t register_tm_clones
U __sigsetjmp@@GLIBC_2.2.5
U __stack_chk_fail@@GLIBC_2.4
0000000000202078 d __TMC_END__
**T is public information**. The publisher_xxx functions are there as expected. In other words, everything works as long as `libpublisher.so.0.0.0` can be linked at run time. `pthread_mutex_init` and the like are **U**, but these are standard functions, so they are linked normally.
By the way, the dputil_xxx functions appear above as local functions inside `libpublisher.so`, but they actually come from a static library.
$ nm .libs/libdputil.a
dp_util.o:
00000000000000b0 T dputil_list_pop
0000000000000070 T dputil_list_pull
0000000000000040 T dputil_list_push
0000000000000000 T dputil_lock
0000000000000020 T dputil_unlock
U _GLOBAL_OFFSET_TABLE_
U pthread_mutex_lock
U pthread_mutex_unlock
The fact that these are not **U** in `libpublisher.so` also shows that it took them in at build time.
You can also dump the information a binary holds with the `objdump` command. Because it shows all sorts of information beyond linking, it should be useful for analysis once you master it (though I think that comes close to reading assembly).
$ objdump -t libpublisher.so.0.0.0
libpublisher.so.0.0.0: file format elf64-x86-64
SYMBOL TABLE:
...
0000000000000ee0 l F .text 0000000000000012 dputil_lock
...
0000000000000000 F *UND* 0000000000000000 free@@GLIBC_2.2.5
...
0000000000000ae0 g F .text 0000000000000032 publisher_unsubscribe
...
As shown here, **g marks public (global) information**, **l marks local symbols**, and **`*UND*` marks unresolved symbols**. I will skip the detailed meaning of each term.
The first way to limit public information is the same as in the previous article: **list the functions to publish in XXX.map and add `-Wl,--version-script,libtimelog.map` to the build options**.
LDFLAGS+=-Wl,--version-script,libtimelog.map
libtimelog.map
{
global:
timetestlog_init;
timetestlog_store_printf;
timetestlog_exit;
local: *;
};
It's simple and easy to understand, but it's a hassle to create a configuration file.
-fvisibility=hidden
If you specify `-fvisibility=hidden`, every function is made private first.
You then add `__attribute__((visibility("default")))` only to what you need and publish just that.
It is a perfect fit for people who like to control things through coding rules.
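In source code, this technique might look like the following minimal sketch. The macro `SAMPLE_API` and both function names are hypothetical; only the attribute itself and the `-fvisibility=hidden` flag come from the compiler.

```c
#include <stddef.h>

/* Hypothetical export macro: marks a symbol as public
 * even when the file is built with -fvisibility=hidden. */
#define SAMPLE_API __attribute__((visibility("default")))

/* Hypothetical size of one content entry, in bytes. */
#define SAMPLE_CONTENT_BYTES 4

/* No attribute: hidden under -fvisibility=hidden, but still
 * callable from other source files inside the same library. */
size_t sample_round_up(size_t value, size_t unit)
{
    return (value + unit - 1) / unit * unit;
}

/* Public entry point: the only function meant to be exported. */
SAMPLE_API size_t sample_buffer_size(size_t contents_num)
{
    return sample_round_up(contents_num * SAMPLE_CONTENT_BYTES, 16);
}
```

Built with something like `gcc -shared -fPIC -fvisibility=hidden`, I would expect `nm -D` to list only `sample_buffer_size` as **T**, even though `sample_round_up` is not `static`.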
For the `libpublisher.so` used in the `nm` examples, I specified `-fvisibility=hidden` and declared each public function like `int __attribute__((visibility("default"))) publisher_new(size_t contents_num)` before building.
As a result, it bothers me a little that the libdputil.a API shows up as T, but otherwise the public symbols are nicely limited.
$ nm -D .libs/libpublisher.so.0.0.0
0000000000202098 B __bss_start
U calloc
w __cxa_finalize
0000000000001210 T dputil_list_pop
00000000000011d0 T dputil_list_pull
00000000000011a0 T dputil_list_push
0000000000001160 T dputil_lock
0000000000001180 T dputil_unlock
0000000000202098 D _edata
00000000002020c0 B _end
0000000000001224 T _fini
U free
w __gmon_start__
0000000000000a00 T _init
w _ITM_deregisterTMCloneTable
w _ITM_registerTMCloneTable
U pthread_mutex_init
U pthread_mutex_lock
U pthread_mutex_unlock
U __pthread_register_cancel
U __pthread_unregister_cancel
w __pthread_unwind_next
0000000000000c10 T publisher_free
0000000000000c80 T publisher_new
0000000000000da0 T publisher_publish
0000000000000d10 T publisher_subscribe
0000000000000d60 T publisher_unsubscribe
U __sigsetjmp
U __stack_chk_fail
I prefer **--version-script**, because I do not have to think about public versus private while writing code. Writing everything together in one configuration file is easier for me.
2018/05/20 postscript: I also like the fact that this approach can be automated with a script. I wrote a sample script that searches the include directories and generates a version-script. It does not handle macros well, but I think it can be used as is.
#!/bin/sh
# Print a version-script that publishes every function declared under directory $1.
output_conf_map() {
    # Output format:
    #{
    #  global:
    #    function_name;
    #    ...
    #  local: *;
    #};
    # Only get the function list from the header files
    HEADER_FUNC_LIST=$(grep -r "[a-zA-Z](" "$1" | grep -v "@brief" | grep -v "#define" | awk -F"(" '{print $1}' | awk -F " " '{print $NF}')
    # template
    echo "{"
    echo "  global:"
    # show all functions
    for data in $HEADER_FUNC_LIST
    do
        echo "    $data;"
    done
    # template end
    echo "  local: *;"
    echo "};"
}

# Run it for every directory named "include" below the current directory
INCLUDE_LIST=$(find . -name include)
for inc_dir in $INCLUDE_LIST
do
    echo "$inc_dir"
    output_conf_map "$inc_dir"
done
First, an apology. What prompted this article was a comment on the article on library publication restrictions: "Why are there so many different ways to restrict the public API of a shared library? What does public even mean in the first place? Why do people live?" So at first I aimed to answer that question precisely, in the language of programming.
However, I could not discuss this area without digging deep, and it really was turning into a Zen riddle, so I went back to asking myself, "What was it about libraries that I wanted to sort out?" Thanks to that, I feel the fuzzy parts, the things I only half understood as a developer working on middleware and above, have become clear.
Beyond that, I still run into plenty of trouble around libraries, but knowing how to check things reduces the anxiety. Especially with OSS. `ldd` and `nm` are super convenient.
A site that explains memory handling very carefully (and the one that made me think "oh, this is a rabbit hole if I dig deeper"): When asked for the code size | Things you can't tell at school | [Technical column collection] Built-in gate | Uquest Co., Ltd.
How functions are handled as seen from the assembler: A story of being easily defeated by recent technology when trying to show old techniques | Possible Eria
Definition of external linkage: https://msdn.microsoft.com/ja-jp/library/k8w8btzz.aspx
How to limit what a library exposes in C++: How to write a shared library in C++ (gcc edition) - Qiita
How to read nm output: Display list of symbols from object with nm command - Qiita
Library names: [Library - Basic knowledge of communication terms](http://www.wdic.org/w/TECH/ Library)