A) It requires careful tuning of the initial learning rate
B) It's computation-intensive due to the square root operation
C) It's computation-intensive due to the square root operation
D) It can cause the learning rate to decay excessively, hindering convergence