H/FOSS projects can consist of hundreds of thousands of lines of code distributed across hundreds or even thousands of files stored in tens of directories. This kind of scale and structure can make it difficult to find the code that responsible for a particular behavior (e.g. error, bug, warning). This activity walks you through the process of finding, understanding and fixing a readability issue and a bug in the Freeciv project that you built in the Building a FOSS Project Activity.
One important theme in this activity is that it pays to fully understand an issue before making a change. Rushing forward with changes based on a superficial understanding of the code is likely to introduce rather than fix bugs. We'll look at two instances where there seems to be a perfectly reasonable change to be made, but by digging deeper it is discovered that one would have introduced a bug, and one where the first intuition ultimately turns out to be correct but also with deeper understanding leads to more readable code and also prevents a potential future bug.
Prerequisites
Before starting it may be worthwhile to review the find
and grep
commands. Our Unix Tools Challenges Activity and the sites below are good starting points:
Assignment
When building Freeciv, a number of compile warnings are reported. One reports an unused variable in text.c
and the other questionable uses of the logical not operator in the unittype.c
and improvements.c
. This point of this assignment is to work through the process of understanding and fixing the underlying issues that cause these warnings. In doing so you will practice skills and techniques that are applicable to any type of work in a large code base.
Getting Started
Anytime you are going to make changes to a project it is important to keep a clean version of the project, without your changes. This greatly simplifies the process of integrating any changes back into the main project. If we were working with Freeciv in a VCS, like Git or SVN, we would create a feature branch and work there (as we did in the Git/GitHub Workflow Activity). In this case, both because we are not using a VCS and because it makes things easier to understand, we'll just make a copy of the project directory.
freeciv-2.1.9
directory (from the Building a FOSS Project Activity.)
make clean
to remove all compiled files from the project.
freeciv-2.1.9
directory. This is probably your Documents
directory.
freeciv-2.1.9
directory and all of its contents to a new directory freeciv-2.1.9-mod
.
-r
flag to cp
recursively copies a directory and all of its contents.
cp -r dir newDir
Compile Errors and Warnings
If a program contains syntax errors, the compiler will terminate with a compile error. This will cause the make process to terminate with an error as well. In addition to compile errors, the compiler also generates warnings that indicate statements in the program that may not have the intended effect (i.e. they point out potential common coding errors). These are warnings and not errors because the code is syntactically correct and may in fact be what the programmer intends. But, as we'll see, at best a compiler warning usually indicates poorly written code and at worst will highlight a logical error. It is quite common for H/FOSS projects to report lots of compiler warnings when they are built. And it is equally common for these warnings to be ignored - even when they shouldn't be.
make
in the freeciv-2.1.9-mod
directory.
grep
tool, and this is exactly the type of thing that it was created to help with!
make clean
.
make
again, but this time pipe the output through grep
so that only lines with compile warnings (i.e. lines containing the string "warning:") are displayed.
Finding the Responsible Code
The "grepped" output from make shows us the warning messages, the names of the files that contain them and indicates offending line numbers. That's great, but most significant H/FOSS projects have hundreds or thousands of files stored in tens of directories. It could take quite a while to find the problematic files and lines if you are not already familiar with the project. Fortunately, we already know about some tools (find
and grep
) that will be helpful.
In this section we'll find and fix some code readability issues that are highlighted by the compile warnings.
text.c: In function 'popup_info_text': text.c:145:14: warning: variable 'infra' set but not used [-Wunused-but-set-variable] bv_special infra; ^
We can see that the warning is in the file text.c
and is in the popup_info_text
method at column 15 of line 145. The warning tells us that the variable infra
is declared here and somewhere later is set (i.e. assigned a value), but that its value is never actually used. This suggests that both the variable and the assignment are unnecessary.
The [-Wunused-but-set-variable]
indicates that warnings of this type can be turned off by passing the -Wunused-but-set-variable
flag to the compiler. This is an easy-out that would make this type of compile warning go away, but that simply masks the problem, and is usually not the best solution! Let's dig a little deeper.
find
command to determine the directory containing the text.c
file.
text.c
file and find the definition of the infra
variable and where it is assigned to.
gedit filename
will open editor with the file.
popup_info_text
where infra
is assigned to.
You will see that infra
is assigned the result of a function call. This suggests that the result of that function is important and will be used. But we already know that the value assigned to infra
isn't used. Confusing! Right?
At a minimum we can eliminate the variable and the assignment. But we could also, quite reasonably, think about eliminating the function call all together since the return value is unused. That could even lead to some nice performance improvement. But before we rush in, let's dig a little deeper.
We need to look at the function being called. But where is it defined?
grep
to find the definition of the function.
grep -r -n "string"
- searches recursively (-r
) in all subdirectories of the current directory and prints the names (-n
) of the files that contain string
.
-I
flag to grep
causes it to ignore binary files.
string
being searched for with grep
(e.g. "bv_special method_name"
)
grep
command to display the full definition of the function.
-B # -A #
flags to grep
will display the specified number (#) of lines before (-B
) and after (-A
) the line where string
is found.
Notice that both parameters to the function are passed by reference. The first parameter is declared to be const
, which ensures that its value cannot be modified. The second parameter however, is not declared const
, so its value, and because it was passed by reference, the value of the argument provided in the call can be changed by the method. In other words, when called this function can have the side effect of modifying the value of its second argument (this happens on line 76).
So we now know that the function has a side effect, and thus the call to it cannot be eliminated from text.c
. In fact, doing so would likely have introduced a new bug into the program.
We can however, still eliminate the infra
variable and the corresponding assignment. This change will communicate more clearly that the function call is being made not for the return value, but because of its side effect.
Modify the text.c
file by removing the infra
variable and the assignment, leaving the function call. This maintains the proper functionality and makes the code better express the intention.
Run make
. Notice that only the files in the client
directory are compiled and that the compile warning is now gone.
The same compile warning applies to a number of other files as well. Fix at least one more.
Fixing a Bug
Sometimes compiler warnings are innocuous, as in the previous section. The compile warning we addressed turned out to reflect a readability issue. The code was more communicative about its purpose after the change, but the change did not alter the functionality of the program. Other times, compiler warnings can lead to the discovery of bugs. We'll take a look at one of those in this section.
make clean
.
make
piping the results through grep
so that you only see the warning messages.
Several among the first few warning messages include:
improvement.c: In function 'can_player_build_improvement_direct': improvement.c:378:47: warning: logical not is only applied to the left hand side of comparison [-Wlogical-not-parentheses] (!get_player_bonus(p, EFT_ENABLE_SPACE) > 0 ^ unittype.c: In function 'can_player_build_unit_direct': unittype.c:510:48: warning: logical not is only applied to the left hand side of comparison [-Wlogical-not-parentheses] && !get_player_bonus(p, EFT_ENABLE_NUKE) > 0) {
This warning points out occurrences of a common mistake made by C programmers. The logical not (!
) operator has higher precedence than the comparison operators (< > <= >= ==
). Thus, in the statements above, the result of the call to the function get_player_bonus
is negated before the comparison (> 0
) is performed.
If you are not a well versed C programmer, that should give you a moment of pause... The result of a logical not operation will be true or false. How can true or false be > 0? The answer is peculiar to the C language. In C, boolean values are represented as integers with zero being false and any non-zero value being true. So for example, !5 > 0 has the value false (5 is a true value, !5 is a false value (i.e. 0) and 0 > 0 is false). Similarly, !-5 > 0 would also be false. Conversely, !0 > 0 would be true (0 is false so !0 is true, anytime C generates a true value in an expression it uses 1, and 1 > 0 is true).
Ok, so they probably just meant !(get_player_bonus(...) > 0)
or the more readable get_player_bonus(...) <= 0)
. Let's just make the change, submit the fix to the project, and get on with other stuff. But wait a second... Considering that if last time we had gone with our initial thought we would have introduced a bug, maybe we should dig a little deeper.
find
command to locate the unittype.c
file.
unittype.c
file in an editor (e.g. gedit
) and find the code that generated the compile warning.
The offending code is part of the can_player_build_unit_direct
function. This function returns TRUE
(a constant defined to be 1) or FALSE
(an constant defined to be 0), indicating whether or not the player may build a particular type of unit in the game. More specifically, the statement (shown below) seems to be saying that if a unit is nuclear (utype_has_flag(punittype, F_NUCLEAR)
and the player's EFT_ENABLE_NUKE
bonus is not greater than 0, then he/she cannot build the feature (i.e. return FALSE
).
if (utype_has_flag(punittype, F_NUCLEAR) && !get_player_bonus(p, EFT_ENABLE_NUKE) > 0) { return FALSE; }
!get_player_bonus(p, EFT_ENABLE_NUKE) > 0
more carefully to see what this statement is really saying.
x
is of type int
.
!
has higher precedence than >
.
x = get_player_bonus(...); x !x !x > 0 !(x > 0) x <= 0 ------------------------------------- 0 1 >0 <0
From the above truth table you can see that what the code is actually saying is different than what we imagine it should be saying. In particular, this bit of the can_player_build_unit_direct
function will only return FALSE
if the get_player_bonus
is 0. This leaves open the possibility that a player with a negative bonus will be able to build the nuclear unit.
Super! Done! Change it to (get_player_bonus(p, EFT_ENABLE_NUKE) <= 0)
, and off we go... Well... maybe not just yet. Maybe player bonuses can be negative and a player with any non-zero bonus should be able to build the nuclear unit. Let's dig a little deeper and see if there is any evidence in the code that might help clear this up.
In C, there is an unsigned
modifier for numeric data types. For example, an unsigned int
can only hold non-negative values. This is an effective mechanism for enforcing constraints such as disallowing negative player bonuses. Perhaps this has been used.
get_player_bonus
function to see if its return type can hold negative values.
get_player_bonus
calls another function and returns its result. Find the definition of that function and check if its return type can hold negative values.
It turns out that this function sums up a number of values and returns that sum as the bonus (bonus += peffect->value;
). So even though the function return types we found can hold negative values, we now still need to determine if the summed value
s can be negative. To do that we'll need to determine the data type of value
and to do that we'll need to figure out the datatype of peffect
.
peffect
variable is a pointer to a struct
(sort of like a reference to an object in Java). Find the name of the type of struct
that peffect
points to.
peffect
is passed into.
peffect
.
struct
that peffect
points to and determine the data type of the value
field.
struct
begins with struct name
.
grep
Whew... we've reached the bottom and its int
s all the way down. So there is nothing obvious in the code that would enforce player bonuses always being non-negative. But where do the actual values come from? And can they be negative?
If you were to keep digging, (a I did, so you don't have to) eventually you would determine that the values for the effects are stored in data files (data/civ1/effects.ruleset
and data/civ2/effects.ruleset
). Looking in those files shows that some of the effect values are in fact negative.
This would be a really good time to ask a question of the developer community. You've made a good attempt to understand the issue and have enough information to pose a good question that is likely to receive a good response. Since we are working from an old version of Freeciv, we won't actually ask the question. But if you were working on the current development you might write something like:
I was investigating the source of some of the compiler warnings that appear during the make process. In particular, I was looking at these two: improvement.c: In function 'can_player_build_improvement_direct': improvement.c:378:47: warning: logical not is only applied to the left... (!get_player_bonus(p, EFT_ENABLE_SPACE) > 0 ^ unittype.c: In function 'can_player_build_unit_direct': unittype.c:510:48: warning: logical not is only applied to the left... && !get_player_bonus(p, EFT_ENABLE_NUKE) > 0) { As written ( Should a player with a negative bonus be allowed to build? If not, I propose the conditions should be rewritten as Can anyone think of any reason these conditions shouldn't be changed? |
Interestingly, there was a commit that made our proposed fix. Notice that this patch wasn't made until 2015, so this bug hung around for quite a while and was only resolved because someone was fixing compile warnings. But also notice, that it is not clear from the commit message (Cleaned out extra "not" from comparisons of boolean effect values
) that the author of the patch realized that the change was modifying the behavior of the program.
improvement.c
and unittype.c
files.
If you were to go digging around some more in the code you will find quite a number of instances of where get_player_bonus
1) if it returns a negative value. For example:
if (get_player_bonus(pplayer, EFT_NO_DIPLOMACY) || get_player_bonus(pother, EFT_NO_DIPLOMACY)) {
Or:
if (get_player_bonus(pplayer1, EFT_NO_DIPLOMACY) == 0 && get_player_bonus(pplayer2, EFT_NO_DIPLOMACY) == 0) {
And additional places where it is explicitly checked for being positive:
if (get_player_bonus(pplayer, EFT_REVEAL_CITIES) > 0) {
While we are not going to do it, developing a consistent handling of player bonus values as signed integer values, and applying it across the code base, would enhance the future readability and maintainability of the project.
freeciv-2.1.9-mod
(with your changes) and freeciv-2.1.9
directories. You'll be using them again in the next activity.
Acknowledgements: This assignment builds from and adapts ideas and content from the following activities created by others: