Opened 7 years ago

Closed 7 years ago

#5064 closed Bug/Something is broken (fixed)

Added apache core-dump on attucks (maybe)

Reported by: Ross
Owned by: Ross
Priority: Medium
Component: Tech
Keywords: apache2 attucks.mayfirst.org
Cc:
Sensitive: no

Description

I enabled core dumps on attucks so that the segmentation faults that take down the occupyboston.org site produce more output. However, I'm not sure this will work. I was following this blog post:

http://www.question-defense.com/2009/07/15/apache-notice-child-pid-27609-exit-signal-segmentation-fault-11

which means I added this line to /etc/apache2/httpd.conf:

CoreDumpDirectory "/tmp/apache-cores"

Not sure if this will have the desired effect, but if it does then I'll need to use gdb to read the dump.

~/ross

Change History (7)

comment:1 Changed 7 years ago by Ross

Nope, this did not work :( Not sure how to get these core dumps, but it's pretty important given that this is happening repeatedly.

comment:2 Changed 7 years ago by Ross

Resolution: fixed
Status: new → closed

This failed, but we figured out the problem: WordPress was using a W3C Caching extension called apc, which threw the seg faults.

comment:3 Changed 7 years ago by Daniel Kahn Gillmor

Resolution: fixed
Status: closed → assigned

ross managed to get core dumps happening by restarting apache2 after having done

ulimit -c unlimited
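
For context, ulimit -c unlimited raises the shell's soft core-file size limit, and processes started from that shell (such as the restarted apache2) inherit it. Below is a minimal C sketch of the same operation, assuming a POSIX system. The helper names raise_core_limit and core_dumps_enabled are hypothetical and not part of Apache or of anything run in this ticket.

```c
#include <sys/resource.h>

/* Hypothetical helper: raise the soft RLIMIT_CORE up to the hard limit.
 * This is roughly what `ulimit -c unlimited` achieves when the hard
 * limit is itself unlimited. Returns 0 on success, -1 on error. */
int raise_core_limit(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CORE, &rl) != 0)
        return -1;
    rl.rlim_cur = rl.rlim_max;  /* the soft limit may go up to the hard limit */
    return setrlimit(RLIMIT_CORE, &rl);
}

/* Hypothetical helper: returns 1 if the soft core limit is non-zero
 * (core files may be written), 0 if it is zero, -1 on error. */
int core_dumps_enabled(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CORE, &rl) != 0)
        return -1;
    return rl.rlim_cur != 0;
}
```

A process that still reports a zero limit after this has a hard limit of zero, which would have to be raised elsewhere (by root, or in the init script's environment).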

once we got a coredump, we tried to look at it with gdb, but got unintelligible stacktraces and the following error message:

  warning: The current binary is a PIE (Position Independent
  Executable), which GDB does NOT currently support.  Most debugger
  features will fail if used in this session.

Looking more closely, we turned up DebianBug:346409 (gdb failures with PIE), which is still relevant in squeeze. I backported gdb 7.3 to squeeze since that appears to work with PIE.

We also installed DebianPackage:apache2-dbg and DebianPackage:php-dbg so that we'd have the relevant debugging info.

The stacktrace from one coredump since then is:

#0  0x00007f73ec9be57d in do_bind_function (opline=0x7f73e17f0390, function_table=0x7f73f1901c10, compile_time=0 '\000') at /tmp/buildd/php5-5.3.3/Zend/zend_compile.c:2956
#1  0x00007f73eca0110c in ZEND_DECLARE_FUNCTION_SPEC_HANDLER (execute_data=0x7f73f1beea28) at /tmp/buildd/php5-5.3.3/Zend/zend_vm_execute.h:582
#2  0x00007f73eca00ea0 in execute (op_array=0x7f73f2354278) at /tmp/buildd/php5-5.3.3/Zend/zend_vm_execute.h:107
#3  0x00007f73e254d116 in ?? () from /usr/lib/php5/20090626/suhosin.so
#4  0x00007f73eca0cb5f in ZEND_INCLUDE_OR_EVAL_SPEC_TMP_HANDLER (execute_data=0x7f73f1bee2f8) at /tmp/buildd/php5-5.3.3/Zend/zend_vm_execute.h:5256
#5  0x00007f73eca00ea0 in execute (op_array=0x7f73f23534c0) at /tmp/buildd/php5-5.3.3/Zend/zend_vm_execute.h:107
#6  0x00007f73e254d116 in ?? () from /usr/lib/php5/20090626/suhosin.so
#7  0x00007f73eca0783d in ZEND_INCLUDE_OR_EVAL_SPEC_CV_HANDLER (execute_data=0x7f73f1bea0c0) at /tmp/buildd/php5-5.3.3/Zend/zend_vm_execute.h:22511
#8  0x00007f73eca00ea0 in execute (op_array=0x7f73f1fdc0f0) at /tmp/buildd/php5-5.3.3/Zend/zend_vm_execute.h:107
#9  0x00007f73e254d116 in ?? () from /usr/lib/php5/20090626/suhosin.so
#10 0x00007f73eca0cb5f in ZEND_INCLUDE_OR_EVAL_SPEC_TMP_HANDLER (execute_data=0x7f73f1be9bc0) at /tmp/buildd/php5-5.3.3/Zend/zend_vm_execute.h:5256
#11 0x00007f73eca00ea0 in execute (op_array=0x7f73f1fdb7b8) at /tmp/buildd/php5-5.3.3/Zend/zend_vm_execute.h:107
#12 0x00007f73e254d116 in ?? () from /usr/lib/php5/20090626/suhosin.so
#13 0x00007f73eca0cb5f in ZEND_INCLUDE_OR_EVAL_SPEC_TMP_HANDLER (execute_data=0x7f73f1be9190) at /tmp/buildd/php5-5.3.3/Zend/zend_vm_execute.h:5256
#14 0x00007f73eca00ea0 in execute (op_array=0x7f73f1fd79d8) at /tmp/buildd/php5-5.3.3/Zend/zend_vm_execute.h:107
#15 0x00007f73e254d116 in ?? () from /usr/lib/php5/20090626/suhosin.so
#16 0x00007f73eca0cb5f in ZEND_INCLUDE_OR_EVAL_SPEC_TMP_HANDLER (execute_data=0x7f73f1be8f10) at /tmp/buildd/php5-5.3.3/Zend/zend_vm_execute.h:5256
#17 0x00007f73eca00ea0 in execute (op_array=0x7f73f1fd7138) at /tmp/buildd/php5-5.3.3/Zend/zend_vm_execute.h:107
#18 0x00007f73e254d116 in ?? () from /usr/lib/php5/20090626/suhosin.so
#19 0x00007f73eca07e53 in ZEND_INCLUDE_OR_EVAL_SPEC_CONST_HANDLER (execute_data=0x7f73f1be8e28) at /tmp/buildd/php5-5.3.3/Zend/zend_vm_execute.h:1971
#20 0x00007f73eca00ea0 in execute (op_array=0x7f73f1fd6c78) at /tmp/buildd/php5-5.3.3/Zend/zend_vm_execute.h:107
#21 0x00007f73e254d116 in ?? () from /usr/lib/php5/20090626/suhosin.so
#22 0x00007f73ec9d885d in zend_execute_scripts (type=32627, retval=0x7fff75cc2ef0, file_count=3) at /tmp/buildd/php5-5.3.3/Zend/zend.c:1266
#23 0x00007f73ec9840d8 in php_execute_script (primary_file=0x7f73f1901c10) at /tmp/buildd/php5-5.3.3/main/main.c:2288
#24 0x00007f73eca662dd in php_handler (r=0x7f73eca662dd) at /tmp/buildd/php5-5.3.3/sapi/apache2handler/sapi_apache2.c:688
#25 0x00007f73f1227880 in ap_run_handler (r=0x7f73f1bb5ae0) at /build/buildd-apache2_2.2.16-6+squeeze4-amd64-pXldSC/apache2-2.2.16/server/config.c:159
#26 0x00007f73f122b21e in ap_invoke_handler (r=0x7f73f1bb5ae0) at /build/buildd-apache2_2.2.16-6+squeeze4-amd64-pXldSC/apache2-2.2.16/server/config.c:377
#27 0x00007f73f1238e48 in ap_process_request (r=0x7f73f1bb5ae0) at /build/buildd-apache2_2.2.16-6+squeeze4-amd64-pXldSC/apache2-2.2.16/modules/http/http_request.c:282
#28 0x00007f73f1235cf8 in ap_process_http_connection (c=0x7f73f1b8e6d0) at /build/buildd-apache2_2.2.16-6+squeeze4-amd64-pXldSC/apache2-2.2.16/modules/http/http_core.c:190
#29 0x00007f73f122f478 in ap_run_process_connection (c=0x7f73f1b8e6d0) at /build/buildd-apache2_2.2.16-6+squeeze4-amd64-pXldSC/apache2-2.2.16/server/connection.c:43
#30 0x00007f73f123de97 in child_main (child_num_arg=<optimized out>) at /build/buildd-apache2_2.2.16-6+squeeze4-amd64-pXldSC/apache2-2.2.16/server/mpm/prefork/prefork.c:662
#31 0x00007f73f123e1aa in make_child (s=0x7f73f169a938, slot=18) at /build/buildd-apache2_2.2.16-6+squeeze4-amd64-pXldSC/apache2-2.2.16/server/mpm/prefork/prefork.c:758
#32 0x00007f73f123ee34 in perform_idle_server_maintenance (p=<optimized out>) at /build/buildd-apache2_2.2.16-6+squeeze4-amd64-pXldSC/apache2-2.2.16/server/mpm/prefork/prefork.c:893
#33 ap_mpm_run (_pconf=<optimized out>, plog=<optimized out>, s=<optimized out>) at /build/buildd-apache2_2.2.16-6+squeeze4-amd64-pXldSC/apache2-2.2.16/server/mpm/prefork/prefork.c:1097
#34 0x00007f73f1213a90 in main (argc=3, argv=0x7fff75cc58e8) at /build/buildd-apache2_2.2.16-6+squeeze4-amd64-pXldSC/apache2-2.2.16/server/main.c:742

Digging around in the code and the variables involved, the crash seems to be during wordpress' loading of the stout-google-calendar plugin, when it includes pluggable.php, during the first function definition there (of wp_set_current_user).

The top C frame (apparently at the end of do_bind_function) looks like it has the local variable function set to 0x01 -- if that were dereferenced, it would certainly cause a segfault.

Given that function is not initialized when declared, and that the return value of the first invocation of zend_hash_find is never checked, i'm not surprised to find that bugs might creep in here.
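
To illustrate the pattern described above (a pointer left uninitialised while the lookup's return value is ignored), here is a minimal C sketch of the defensive version. It is not PHP's code: struct entry, lookup and bump_refcount are hypothetical stand-ins, and whether this is what actually went wrong in do_bind_function is not established here.

```c
#include <stddef.h>
#include <string.h>

struct entry {
    const char *key;
    int refcount;
};

/* Hypothetical stand-in for a hash lookup: on success stores the entry
 * in *out and returns 0; on failure returns -1 and leaves *out untouched. */
static int lookup(struct entry *table, size_t n, const char *key,
                  struct entry **out)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (strcmp(table[i].key, key) == 0) {
            *out = &table[i];
            return 0;
        }
    }
    return -1;
}

/* Defensive variant: initialise the pointer and check the lookup result
 * before dereferencing. Returns 0 on success, -1 if the key is missing. */
int bump_refcount(struct entry *table, size_t n, const char *key)
{
    struct entry *e = NULL;

    if (lookup(table, n, key, &e) != 0 || e == NULL)
        return -1;  /* without this check, e could hold any stale value */
    e->refcount++;
    return 0;
}
```

If the pointer were left uninitialised and the failed lookup ignored, the increment would go through whatever value happened to be on the stack, which is the kind of failure a value like 0x01 would produce.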

Maybe the next step would be to look at the current state of do_bind_function in the latest version of php, to see if there are bug fixes there? of course, the bug could be somewhere else as well.

also: it's not clear to me how to tell which php modules were loaded from the backtrace itself. clearly suhosin is loaded, but we don't have debugging info available for it at the moment. various people have pointed fingers at APC, but i don't see APC being directly implicated here.

We have also tried to replicate the crash, but haven't seen it happen yet. hopefully we can get another coredump to compare this one to.

comment:4 Changed 7 years ago by Daniel Kahn Gillmor

i wasn't clear where the crash was happening in do_bind_function because gdb was pointing me to the end of the function. Then i realized that the problem was that i hadn't applied the debian-specific patches to my copy of the source.

So in the php5-5.3.3 source directory, i did:

debian/rules patch

which applied the debian-specific patches.

and now looking at the line numbers makes more sense. in particular, we're at the => line:

ZEND_API int do_bind_function(zend_op *opline, HashTable *function_table, zend_bool compile_time) /* {{{ */
{
    zend_function *function;

    zend_hash_find(function_table, opline->op1.u.constant.value.str.val, opline->op1.u.constant.value.str.len, (void *) &function);
    if (zend_hash_add(function_table, opline->op2.u.constant.value.str.val, opline->op2.u.constant.value.str.len+1, function, sizeof(zend_function), NULL)==FAILURE) {
        int error_level = compile_time ? E_COMPILE_ERROR : E_ERROR;
        zend_function *old_function;

        if (zend_hash_find(function_table, opline->op2.u.constant.value.str.val, opline->op2.u.constant.value.str.len+1, (void *) &old_function)==SUCCESS
            && old_function->type == ZEND_USER_FUNCTION
            && old_function->op_array.last > 0) {
            zend_error(error_level, "Cannot redeclare %s() (previously declared in %s:%d)",
                        function->common.function_name,
                        old_function->op_array.filename,
                        old_function->op_array.opcodes[0].lineno);
        } else {
            zend_error(error_level, "Cannot redeclare %s()", function->common.function_name);
        }
        return FAILURE;
    } else {
=>      (*function->op_array.refcount)++;
        function->op_array.static_variables = NULL; /* NULL out the unbound function */
        return SUCCESS;
    }
}

and function is indeed set to 0x01, so that's the dereference that causes the segfault. I wish we knew the return value of the initial zend_hash_find, though :/

and here's a bit of weirdness, also in this same stack frame:

(gdb) print opline->op2.u.constant.value.str.val
$58 = 0x7f73e17f7348 "wp_set_current_user"
(gdb) print opline->op1.u.constant.value.str.val
$59 = 0x7f73e17f72d0 ""
(gdb) print opline->op1.u.constant.value.str.val+1
$60 = 0x7f73e17f72d1 "wp_set_current_user/home/members/occupyboston/sites/web.occupyboston.org/web/wp-includes/pluggable.php0x7f73f11bf34d"
(gdb) print opline->op1.u.constant.value.str.len
$61 = 117
(gdb) print opline->op2.u.constant.value.str.len
$62 = 19
(gdb) 

So it looks like op1's string value starts with a null byte, while op2's value looks reasonable to me. I'm not sure what this implies.
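
One possible reading, offered as an assumption rather than something confirmed in this ticket: the op1 string may be a generated lookup key laid out as a NUL byte, then the function name, then the file path, then a pointer printed as text. The C sketch below only checks that those four pieces, copied from the gdb output above, add up to the reported str.len. The helper name key_length is hypothetical.

```c
#include <string.h>

/* Hypothetical helper: length of a key laid out as
 * '\0' + name + filename + address-as-text. */
size_t key_length(const char *name, const char *filename, const char *addr)
{
    return 1 + strlen(name) + strlen(filename) + strlen(addr);
}
```

Calling key_length with "wp_set_current_user", the pluggable.php path and "0x7f73f11bf34d" gives 117, which matches the op1 length printed by gdb. If that reading is right, the leading null byte is part of the key's layout and not by itself a sign of corruption.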

comment:5 Changed 7 years ago by Daniel Kahn Gillmor

using "info shared" within gdb shows that the apc.so module was loaded during the crash, fwiw.

comment:6 Changed 7 years ago by Ross

Resolution: fixed
Status: assigned → feedback

comment:7 Changed 7 years ago by automatic

Status: feedback → closed

No news is good news (we hope)! Given the lack of feedback, we think this ticket can be closed.
