The Importance of -1 - Stratus

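Don't worry, despite the title I have not reverted to math-geek mode. The topic is not number theory but socket programming, and a coding mistake that I see all too often.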

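The code fragment in Figure 1 shows the mistake. A main while loop runs forever, calling select and waiting for characters to become available. Once select indicates that characters are available, the code enters another loop that runs until 10 characters have been received. After calling recv, the code correctly checks for a return value of 0, which indicates that the remote peer has closed the socket. It then checks whether errno is 0; if so, it concatenates the characters just received onto the application buffer and adds the number of characters just received to the running total for the current message. Finally, it checks whether errno is EWOULDBLOCK; if it has any other value, the program exits with an error. At that point the inner loop body is done, and if the message still holds fewer than 10 characters, recv is called again.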

 while (1)
   {
   FD_ZERO (&fdsetREAD);
   FD_ZERO (&fdsetNULL);
   FD_SET (sockAccepted, &fdsetREAD);
   iNumFDS = sockAccepted + 1;

/* wait for the start of a message to arrive */
   iSelected = select (iNumFDS,
                      &fdsetREAD, &fdsetNULL, &fdsetNULL, &timevalTimeout);
   if (iSelected < 0) /* Error from select, report and abort */
      {
      perror ("minus1: error from select");
      exit (errno);
      }
/* select indicates something to be read. Since there is only 1 socket
   there is no need to figure out which socket is ready. Note that if
   select returns 0 it just means that it timed out, we will just go around
   the loop again.*/
   else if (iSelected > 0)
        {
        szAppBuffer [0] = 0x00;       /* "zero out" the application buffer */
        iTotalCharsRecv = 0;          /* zero out the total characters count */
        while (iTotalCharsRecv < 10)  /* loop until all 10 characters read */
           {                          /* now read from socket */
           iNumCharsRecv = recv (sockAccepted, szRecvBuffer,
                                   10 - iTotalCharsRecv, 0);
           if (iDebugFlag)            /* debug output show */
              {                       /* value returned from recv and errno */
              printf ("%d  %d     ", iNumCharsRecv, errno);
              if (iNumCharsRecv > 0)  /* also received characters if any */
                 {
                 szRecvBuffer [iNumCharsRecv] = 0x00;
                 printf ("[%s]\n", szRecvBuffer);
                 }
              else printf ("\n");
              }
           if (iNumCharsRecv == 0)   /* If 0 characters received exit app */
              {
              printf ("minus1: socket closed\n");
              exit (0);
              }
           else if (errno == 0)      /* if "no error" accumulate received */
              {                      /* chars into an application buffer */
              szRecvBuffer [iNumCharsRecv] = 0x00;
              strcat (szAppBuffer, szRecvBuffer);
              iTotalCharsRecv = iTotalCharsRecv + iNumCharsRecv;
              szRecvBuffer [0] = 0x00;
              }
           else if (errno != EWOULDBLOCK) /* Ignore an EWOULDBLOCK error */
              {                           /* anything else report and abort */
              perror ("minus1: Error from recv");
              exit (errno);
              }
           if (iDebugFlag) sleep (1); /* this prevents the output from */
           }                          /* scrolling off the window */

        sprintf (szOut, "Message [%s] processed\n", szAppBuffer);
        if (iDebugFlag) printf ("%s\n", szOut);
        if (send (sockAccepted, szOut , strlen (szOut), 0) < 0)
           perror ("minus1: error from send");
        }
   }

Figure 1 - Incorrect code fragment

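Figure 2 shows a sample session. Characters sent to the server are highlighted in yellow; the processed messages returned by the server (the lines beginning with Message) are not highlighted. The characters sent include a terminating newline character and are sent in one TCP segment. When exactly 10 characters are sent everything works, but when only 6 characters are sent in a TCP segment the server stops responding.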

123456789
Message [123456789
] processed
abcdefghi
Message [abcdefghi
] processed
12345abcd
Message [12345abcd
] processed
12345
789
abcdefghi
123456789

Figure 2 - Client session

Figure 3 shows the server session with debug turned on. You can see that after the “12345<new line>” characters are received the next recv returns -1 and sets the errno to 5011, which is EWOULDBLOCK. The code then loops and the next recv returns the characters “789<new line>” but the errno value is still set to 5011. In fact every recv after that regardless of whether there are characters received or not has errno set to 5011.

Connection accepted
10  0     [123456789
]
Message [123456789
] processed

10  0     [abcdefghi
]
Message [abcdefghi
] processed

10  0     [12345abcd
]
Message [12345abcd
] processed

6  0     [12345
]
-1  5011
4  5011     [789
]
-1  5011
4  5011     [abcd]
4  5011     [efgh]
4  5011     [i
12]
4  5011     [3456]
4  5011     [789
]
-1  5011
-1  5011
-1  5011

Figure 3 - Server debug output

Because the errno value is not 0, the received characters are never concatenated onto the application buffer, so the code loops forever.

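This is not a bug in the sockets code. The sockets API clearly states that the value of errno is undefined unless the function returns -1. Undefined here means that the value is not set, so errno keeps whatever value it had before.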
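The rule can be observed with a small experiment. The following is only a sketch, assuming a POSIX system with socketpair and fcntl; the names recv_probe and probe_errno_rule are hypothetical and do not come from the server code above. It records errno only when recv returns -1 and otherwise relies on the return value alone.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical record of what two consecutive recv calls reported. */
struct recv_probe
   {
   ssize_t first_ret;    /* recv on an empty non-blocking socket */
   int     first_errno;  /* errno, copied only because recv returned -1 */
   ssize_t second_ret;   /* recv after 4 bytes were sent */
   };

/* Returns 0 if the experiment ran, -1 if setup failed. */
static int probe_errno_rule (struct recv_probe *p)
{
   int  sv [2];
   int  flags;
   int  rc = -1;
   char buf [16];

   if (socketpair (AF_UNIX, SOCK_STREAM, 0, sv) < 0)
      return -1;

   /* Make the receiving end non-blocking. */
   flags = fcntl (sv [1], F_GETFL, 0);
   if (flags >= 0 && fcntl (sv [1], F_SETFL, flags | O_NONBLOCK) == 0)
      {
      /* Nothing has been sent yet, so this recv fails with -1. */
      p->first_ret = recv (sv [1], buf, sizeof buf, 0);
      p->first_errno = (p->first_ret == -1) ? errno : 0;

      if (send (sv [0], "789\n", 4, 0) == 4)
         {
         /* This recv succeeds. errno is deliberately NOT examined:
            it may still hold the value left by the failed call. */
         p->second_ret = recv (sv [1], buf, sizeof buf, 0);
         rc = 0;
         }
      }
   close (sv [0]);
   close (sv [1]);
   return rc;
}
```

On the systems I would expect this to run on, first_ret is -1 with first_errno equal to EWOULDBLOCK (or EAGAIN), and second_ret is 4. Whatever errno holds after the second call says nothing about whether that call worked.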

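Now you may be thinking that no one is going to split a 10-character message into two segments, and you are probably right; but imagine that the message is not 10 characters long but 100 or 1000. Also remember that TCP is a byte stream, not a message stream; the TCP stack may split an application message across multiple TCP segments whenever it likes. Some conditions make this more likely: longer application messages, sending another application message before the previous one has been transmitted, and lost TCP segments all come readily to mind. Under the right conditions it is possible, even probable, that this server code will pass all of its acceptance tests and run correctly in production, at least for a while.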
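Because the stream carries no message boundaries, a receiver that wants a fixed-length message has to keep reading until it has all of it. Here is a minimal sketch of that idea, assuming a blocking socket; recv_exact and demo_split_message are hypothetical names, the retry on EINTR is my own addition, and a socketpair stands in for a real TCP connection. The two sends may arrive at the receiver as one piece or as two; the loop copes with either.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read exactly len bytes from a blocking socket.
   Returns len on success, 0 if the peer closed first, -1 on error. */
static ssize_t recv_exact (int sock, char *buf, size_t len)
{
   size_t total = 0;

   while (total < len)
      {
      ssize_t n = recv (sock, buf + total, len - total, 0);
      if (n > 0)                 /* got some bytes, maybe not all */
         total += (size_t) n;
      else if (n == 0)           /* peer closed the connection */
         return 0;
      else if (errno != EINTR)   /* n is -1, so errno is defined */
         return -1;
      }
   return (ssize_t) total;
}

/* Demo: a 10-byte message handed to the stream as 6 bytes, then 4.
   Returns 0 and fills out (NUL terminated) on success, else -1. */
static int demo_split_message (char out [11])
{
   int     sv [2];
   ssize_t n = -1;

   if (socketpair (AF_UNIX, SOCK_STREAM, 0, sv) < 0)
      return -1;
   if (send (sv [0], "12345\n", 6, 0) == 6 &&
       send (sv [0], "789\n", 4, 0) == 4)
      n = recv_exact (sv [1], out, 10);
   close (sv [0]);
   close (sv [1]);
   if (n != 10)
      return -1;
   out [10] = '\0';
   return 0;
}
```

The helper never consults errno unless recv has returned -1, which is the same discipline the server needs.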

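The good news is that there is a very simple fix: instead of testing for errno == 0, just test for a return value greater than 0; see the changes in Figure 4. Note also that the comment on the "errno != EWOULDBLOCK" test now points out that the only way to reach that if statement is for recv to have returned a negative value, and the only negative value it returns is -1.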

 while (1)
   {
   FD_ZERO (&fdsetREAD);
   FD_ZERO (&fdsetNULL);
   FD_SET (sockAccepted, &fdsetREAD);
   iNumFDS = sockAccepted + 1;

/* wait for the start of a message to arrive */
   iSelected = select (iNumFDS,
                      &fdsetREAD, &fdsetNULL, &fdsetNULL, &timevalTimeout);
   if (iSelected < 0) /* Error from select, report and abort */
      {
      perror ("minus1: error from select");
      exit (errno);
      }
/* select indicates something to be read. Since there is only 1 socket
   there is no need to figure out which socket is ready. Note that if
   select returns 0 it just means that it timed out, we will just go around
   the loop again.*/
   else if (iSelected > 0)
        {
        szAppBuffer [0] = 0x00;       /* "zero out" the application buffer */
        iTotalCharsRecv = 0;          /* zero out the total characters count */
        while (iTotalCharsRecv < 10)  /* loop until all 10 characters read */
           {                          /* now read from socket */
           iNumCharsRecv = recv (sockAccepted, szRecvBuffer,
                                   10 - iTotalCharsRecv, 0);
           if (iDebugFlag)            /* debug output show */
              {                       /* value returned from recv and errno */
              printf ("%d  %d     ", iNumCharsRecv, errno);
              if (iNumCharsRecv > 0)  /* also received characters if any */
                 {
                 szRecvBuffer [iNumCharsRecv] = 0x00;
                 printf ("[%s]\n", szRecvBuffer);
                 }
              else printf ("\n");
              }
           if (iNumCharsRecv == 0)   /* If 0 characters received exit app */
              {
              printf ("minus1: socket closed\n");
              exit (0);
              }
           else if (iNumCharsRecv > 0) /* if no error accumulate received */
              {                        /* chars into an application buffer */
              szRecvBuffer [iNumCharsRecv] = 0x00;
              strcat (szAppBuffer, szRecvBuffer);
              iTotalCharsRecv = iTotalCharsRecv + iNumCharsRecv;
              szRecvBuffer [0] = 0x00;
              }
           else if (errno != EWOULDBLOCK) /* if we get here iNumCharsRecv */
              {                           /* must be -1 so errno is defined */
              perror                      /* Ignore an EWOULDBLOCK error */
               ("minus1: Error from recv"); /* anything else report */
              exit (errno);               /* and abort */
              }
           if (iDebugFlag) sleep (1); /* this prevents the output from */
           }                          /* scrolling off the window */

        sprintf (szOut, "Message [%s] processed\n", szAppBuffer);
        if (iDebugFlag) printf ("%s\n", szOut);
        if (send (sockAccepted, szOut , strlen (szOut), 0) < 0)
           perror ("minus1: error from send");
        }
   }

Figure 4 - Corrected code fragment
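The decision order used in the corrected fragment can also be pulled out into a small function, which makes the rule easy to test on its own. This is only an illustrative sketch: recv_outcome and classify_recv are hypothetical names, and treating EAGAIN like EWOULDBLOCK is an assumption of mine (on many systems the two have the same value).

```c
#include <errno.h>
#include <sys/types.h>

enum recv_outcome { RECV_DATA, RECV_CLOSED, RECV_RETRY, RECV_FATAL };

/* ret is the value recv returned; err is a copy of errno taken
   right after the call. err is only looked at when ret is -1. */
static enum recv_outcome classify_recv (ssize_t ret, int err)
{
   if (ret > 0)                  /* data arrived; errno is irrelevant */
      return RECV_DATA;
   if (ret == 0)                 /* remote peer closed the socket */
      return RECV_CLOSED;
   /* ret must be -1 here, so err is defined */
   if (err == EWOULDBLOCK || err == EAGAIN)
      return RECV_RETRY;         /* nothing to read yet, try again */
   return RECV_FATAL;            /* anything else: report and abort */
}
```

With this ordering, the case from Figure 3, where recv returns 4 while errno still holds a stale EWOULDBLOCK, is classified as RECV_DATA, so the characters are accumulated instead of being dropped.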
