Debugging analysis: WiFi on the Allwinner (Quanzhi) platform cannot connect to the AP

Posted by wednesday on Mon, 28 Feb 2022 11:01:03 +0100

1. Preface

This post records a problem in which WiFi could not connect to the AP, together with the analysis and the solution.

2. Problem record

2.1 Problem origin and description

Problem origin: during trial-production testing on the factory line, units without the antenna soldered on showed a strong AP signal but could not connect to the AP.

Platform information: A10s, Android 4.0.4

WiFi module used: rtl8189etv, software version v4.1.5_7309.20130425

Symptoms reported from factory testing:

  • Antenna not soldered, complete unit under test: the WiFi signal is very strong, but it is difficult to connect to the AP, and connecting takes more than 1 minute;
  • Antenna soldered: the test router is about 10 meters from the prototype and the signal strength is almost full, yet it is still difficult to connect to WiFi. In the same environment, a prototype based on the A20 can connect to WiFi quickly even without a soldered antenna;
  • When connecting to WiFi from Settings, the status repeatedly cycles between "authenticating" and "acquiring ip", and it takes a long time to connect;

2.2 Problem analysis

This part analyzes the connection failure mainly from the software side. Step-by-step analysis eventually showed that it is a hardware problem. The reasoning is as follows.

(1) Reproducing the problem. At first I received a PCBA board with the antenna fitted and ran AP connection tests in the office. I could not reproduce the reported problem (good signal but no connection to the AP); the board connected to the AP quickly.

-->>The simple test did not reproduce the problem. If this is a software bug, suspicion turns to the test tool and the WiFi driver.

(2) WiFi driver confirmation. Judging from the driver's printed output and the SDK code, the rtl8189etv driver version in use is v4.1.5. The v4.1.5 driver has already been verified in mass production on A10s. Checking the code showed that the patch for slow IP acquisition when connecting to some APs had not been applied.

-->>After the patch was applied, the factory reported that the AP still could not be connected. The problem remains.

(3) Reproducing in a simulated factory test environment. Using a PCBA board without the antenna soldered, at an office desk with the router 1 m from the prototype, the WiFi signal is strong and the failure to connect to the AP can be reproduced.

-->>The connection failure reported by the factory can now be reproduced, so analysis starts from the printed log information.

(4) Driver log analysis. The kernel log captured at the factory matches the kernel log captured when the board without a soldered antenna fails to connect to the AP at short range. When a connection request is issued, the request times out. The log was passed to Realtek so that they could analyze it in parallel.

-->>After learning that the v4.1.5 driver was in use, Realtek suggested switching to the current latest v4.3.0.2 driver to verify whether the problem still exists. After the v4.3.0.2 driver was ported, the AP still could not be connected. The problem remains, and the log again shows timeouts.

-->>Realtek analyzed the log in depth and concluded that poor TX/RX performance is causing problems in sending and receiving data. PSD and RF-related hardware tests should be carried out first.

Log from a failed AP connection:

[ 1327.966991] RTL871X: set ssid [NULL-TEST] fw_state=0x00000008
[ 1327.966997] RTL871X: Set SSID under fw_state=0x00000008
[ 1327.967014] RTL871X: [by_bssid:0][assoc_ssid:NULL-TEST][to_roaming:0] new candidate: NULL-TEST(90:94:e4:35:9f:e4) rssi:-62  // Router signal strength
[ 1327.967031] RTL871X: rtw_select_and_join_from_scanned_queue: candidate: NULL-TEST(90:94:e4:35:9f:e4, ch:1)
[ 1327.967063] RTL871X: Capture EPIGRAM_OUI
[ 1327.967068] RTL871X: Capture EPIGRAM_OUI
[ 1327.967074] RTL871X: link to Realtek 96B
[ 1327.967081] RTL871X: rtw_joinbss_cmd: smart_ps=2
[ 1327.967092] RTL871X: set ssid:dot11AuthAlgrthm=2, dot11PrivacyAlgrthm=4, dot118021XGrpPrivacy=2
[ 1327.967101] RTL871X: <=cfg80211_rtw_connect, ret 0
[ 1327.969097] RTL871X: cfg80211_rtw_change_station(wlan0)
[ 1327.969229] RTL871X: cfg80211_rtw_change_station(wlan0)
[ 1327.969309] RTL871X: cfg80211_rtw_change_station(wlan0)
[ 1327.979666] RTL871X: set ch/bw before connected
[ 1327.979697] hw_var_set_bssid   reg=618 
[ 1328.156419] RTL871X: update_mgnt_tx_rate(): rate = 2
[ 1328.209765] RTL871X: Capture EPIGRAM_OUI
[ 1328.213831] RTL871X: Capture EPIGRAM_OUI
[ 1328.217760] RTL871X: link to Realtek 96B
[ 1328.221704] RTL871X: issue_deauth to 90:94:e4:35:9f:e4
[ 1328.226860] RTL871X: start auth
[ 1328.230023] RTL871X: issue_auth
[ 1328.530036] RTL871X: link_timer_hdl: auth timeout and try again
[ 1328.536006] RTL871X: issue_auth
[ 1328.830030] RTL871X: link_timer_hdl: auth timeout and try again
[ 1328.835979] RTL871X: issue_auth
[ 1329.130027] RTL871X: link_timer_hdl: auth timeout and try again
[ 1329.135972] RTL871X: issue_auth
[ 1329.430038] RTL871X: link_timer_hdl: auth timeout and try again
[ 1329.435977] RTL871X: issue_auth
[ 1329.730036] RTL871X: report_join_res(-1)
[ 1329.734110] hw_var_set_bssid   reg=618 
[ 1329.738047] RTL871X: update_mgnt_tx_rate(): rate = 2
[ 1329.743041] RTL871X: _rtw_join_timeout_handler, fw_state=8
[ 1329.748527] RTL871X: rtw_cfg80211_indicate_disconnect(padapter=e0962000)
[ 1329.755246] RTL871X: pwdev->sme_state(b)=1
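The retry pattern in this log can be counted mechanically. The following is a minimal Python sketch, not part of the original analysis: the function name `summarize_auth` and the embedded `SAMPLE_LOG` excerpt are illustrative, and treating a negative `report_join_res` value as "join abandoned" is an assumption based only on the lines that follow it in the log above.

```python
import re

# Kernel timestamp prefix plus the driver message, for example:
# "[ 1328.530036] RTL871X: link_timer_hdl: auth timeout and try again"
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(?:RTL871X:\s+)?(.*)$")
JOIN_RES_RE = re.compile(r"report_join_res\((-?\d+)\)")

# Excerpt copied from the log above (authentication phase only).
SAMPLE_LOG = """\
[ 1328.226860] RTL871X: start auth
[ 1328.230023] RTL871X: issue_auth
[ 1328.530036] RTL871X: link_timer_hdl: auth timeout and try again
[ 1328.536006] RTL871X: issue_auth
[ 1328.830030] RTL871X: link_timer_hdl: auth timeout and try again
[ 1328.835979] RTL871X: issue_auth
[ 1329.130027] RTL871X: link_timer_hdl: auth timeout and try again
[ 1329.135972] RTL871X: issue_auth
[ 1329.430038] RTL871X: link_timer_hdl: auth timeout and try again
[ 1329.435977] RTL871X: issue_auth
[ 1329.730036] RTL871X: report_join_res(-1)
"""


def summarize_auth(log_text):
    """Count auth attempts and timeouts, and pick up the join result."""
    auth_attempts = 0
    timeouts = 0
    join_result = None
    for raw in log_text.splitlines():
        m = LINE_RE.match(raw.strip())
        if not m:
            continue  # skip lines without a kernel timestamp
        msg = m.group(2).strip()
        if msg == "issue_auth":
            auth_attempts += 1
        elif "auth timeout" in msg:
            timeouts += 1
        else:
            res = JOIN_RES_RE.match(msg)
            if res:
                join_result = int(res.group(1))
    return {
        "auth_attempts": auth_attempts,
        "timeouts": timeouts,
        "join_result": join_result,
        # Assumption: a negative join result means the driver gave up.
        "gave_up": join_result is not None and join_result < 0,
    }


if __name__ == "__main__":
    print(summarize_auth(SAMPLE_LOG))
```

Run against a full capture, a summary like this makes it easy to compare the factory log with the one reproduced in the office: in both, every `issue_auth` goes unanswered until the driver reports the join result.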

(5) Over-the-air packet capture analysis. To further verify that poor TX/RX is the cause, capture the over-the-air packets during the connection process and analyze why the AP connection fails.

-->>The packet capture shows that the problem occurs when the prototype initiates the connection. The router replies in time, but the prototype does not respond. This confirms that there is a TX/RX problem: poor RX means the router's reply is not received correctly, and poor TX means the transmitted data is corrupted and cannot be parsed.
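The reasoning applied to the capture can be sketched as a small decision function. This is only an illustration in Python under stated assumptions: the `Frame` model, the `"sta"`/`"ap"` labels, the two simplified frame kinds and the returned strings are all hypothetical and do not come from a real sniffer API. A capture alone is a hint rather than proof, which is why PSD and RF measurements are still needed.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    sender: str  # "sta" (the prototype) or "ap" (the router); hypothetical labels
    kind: str    # "auth" or "assoc_req"; simplified management frame kinds


def classify_capture(frames):
    """Give a rough TX/RX hint from frames seen in an over-the-air capture."""
    sta_auth = sum(1 for f in frames if f.sender == "sta" and f.kind == "auth")
    ap_auth = sum(1 for f in frames if f.sender == "ap" and f.kind == "auth")
    sta_assoc = any(f.sender == "sta" and f.kind == "assoc_req" for f in frames)

    if sta_auth == 0:
        # Nothing from the prototype reached the sniffer at all.
        return "suspect station TX: no auth request seen in the air"
    if ap_auth == 0:
        # The router never answered, so it may not have decoded the request.
        return "suspect station TX: router did not reply"
    if not sta_assoc:
        # The router answered, but the prototype never moved on.
        return "suspect station RX: router replied, station did not proceed"
    return "auth exchange completed"


if __name__ == "__main__":
    # Pattern described above: router replies, prototype keeps retrying.
    capture = [
        Frame("sta", "auth"), Frame("ap", "auth"),
        Frame("sta", "auth"), Frame("ap", "auth"),
    ]
    print(classify_capture(capture))
```

For the situation described above, where the router replies but the prototype keeps retrying, the function returns the "suspect station RX" hint. A bad TX path can coexist with this, so the result should not be read as ruling TX out.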

The problem was not solved quickly. Telling software causes from hardware causes took a long time because of a lack of familiarity with the A10s platform: we did not know whether the rtl8189etv had already been successfully mass-produced on the A10s, whether similar problems had occurred before, or what the results of earlier software debugging and testing were. Problem analysis should start by clarifying these points.

 

Topics: Linux Android wifi